[2024-12-13 03:52:28,809][03180] Saving configuration to ./train_dir/Ant/config.json...
[2024-12-13 03:52:28,810][03180] Rollout worker 0 uses device cpu
[2024-12-13 03:52:28,811][03180] Rollout worker 1 uses device cpu
[2024-12-13 03:52:28,811][03180] Rollout worker 2 uses device cpu
[2024-12-13 03:52:28,811][03180] Rollout worker 3 uses device cpu
[2024-12-13 03:52:28,811][03180] Rollout worker 4 uses device cpu
[2024-12-13 03:52:28,811][03180] Rollout worker 5 uses device cpu
[2024-12-13 03:52:28,811][03180] Rollout worker 6 uses device cpu
[2024-12-13 03:52:28,812][03180] Rollout worker 7 uses device cpu
[2024-12-13 03:52:28,812][03180] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
[2024-12-13 03:52:28,900][03180] InferenceWorker_p0-w0: min num requests: 2
[2024-12-13 03:52:28,937][03180] Starting all processes...
[2024-12-13 03:52:28,937][03180] Starting process learner_proc0
[2024-12-13 03:52:28,947][03180] Starting all processes...
[2024-12-13 03:52:28,952][03180] Starting process inference_proc0-0
[2024-12-13 03:52:28,957][03180] Starting process rollout_proc0
[2024-12-13 03:52:28,963][03180] Starting process rollout_proc1
[2024-12-13 03:52:28,964][03180] Starting process rollout_proc2
[2024-12-13 03:52:28,964][03180] Starting process rollout_proc3
[2024-12-13 03:52:28,964][03180] Starting process rollout_proc4
[2024-12-13 03:52:28,964][03180] Starting process rollout_proc5
[2024-12-13 03:52:28,964][03180] Starting process rollout_proc6
[2024-12-13 03:52:28,964][03180] Starting process rollout_proc7
[2024-12-13 03:52:54,434][03180] Heartbeat connected on InferenceWorker_p0-w0
[2024-12-13 03:52:54,788][03228] Worker 2 uses CPU cores [0]
[2024-12-13 03:52:54,896][03180] Heartbeat connected on RolloutWorker_w2
[2024-12-13 03:52:55,024][03229] Worker 3 uses CPU cores [1]
[2024-12-13 03:52:55,146][03180] Heartbeat connected on RolloutWorker_w3
[2024-12-13 03:52:55,226][03231] Worker 4 uses CPU cores [0]
[2024-12-13 03:52:55,235][03213] Setting fixed seed 1
[2024-12-13 03:52:55,237][03180] Heartbeat connected on Batcher_0
[2024-12-13 03:52:55,242][03213] Initializing actor-critic model on device cpu
[2024-12-13 03:52:55,243][03213] RunningMeanStd input shape: (27,)
[2024-12-13 03:52:55,246][03213] RunningMeanStd input shape: (1,)
[2024-12-13 03:52:55,265][03227] Worker 0 uses CPU cores [0]
[2024-12-13 03:52:55,322][03180] Heartbeat connected on RolloutWorker_w4
[2024-12-13 03:52:55,323][03180] Heartbeat connected on RolloutWorker_w0
[2024-12-13 03:52:55,349][03230] Worker 1 uses CPU cores [1]
[2024-12-13 03:52:55,367][03234] Worker 7 uses CPU cores [1]
[2024-12-13 03:52:55,369][03233] Worker 5 uses CPU cores [1]
[2024-12-13 03:52:55,402][03180] Heartbeat connected on RolloutWorker_w1
[2024-12-13 03:52:55,403][03232] Worker 6 uses CPU cores [0]
[2024-12-13 03:52:55,415][03180] Heartbeat connected on RolloutWorker_w5
[2024-12-13 03:52:55,415][03180] Heartbeat connected on RolloutWorker_w6
[2024-12-13 03:52:55,416][03180] Heartbeat connected on RolloutWorker_w7
[2024-12-13 03:52:55,537][03213] Created Actor Critic model with architecture:
[2024-12-13 03:52:55,538][03213] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): MlpEncoder(
        (mlp_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=Tanh)
          (2): RecursiveScriptModule(original_name=Linear)
          (3): RecursiveScriptModule(original_name=Tanh)
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=64, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationContinuousNonAdaptiveStddev(
    (distribution_linear): Linear(in_features=64, out_features=8, bias=True)
  )
)
[2024-12-13 03:52:56,018][03213] Using optimizer
[2024-12-13 03:53:01,790][03213] No checkpoints found
[2024-12-13 03:53:01,790][03213] Did not load from checkpoint, starting from scratch!
[2024-12-13 03:53:01,793][03213] Initialized policy 0 weights for model version 0
[2024-12-13 03:53:01,797][03213] LearnerWorker_p0 finished initialization!
[2024-12-13 03:53:01,799][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000000_0.pth...
[2024-12-13 03:53:01,802][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000000_0.pth...
[2024-12-13 03:53:01,802][03180] Heartbeat connected on LearnerWorker_p0
[2024-12-13 03:53:01,809][03226] RunningMeanStd input shape: (27,)
[2024-12-13 03:53:01,810][03226] RunningMeanStd input shape: (1,)
[2024-12-13 03:53:01,962][03180] Inference worker 0-0 is ready!
[2024-12-13 03:53:01,962][03180] All inference workers are ready! Signal rollout workers to start!
[2024-12-13 03:53:03,294][03228] Decorrelating experience for 0 frames...
[2024-12-13 03:53:03,293][03227] Decorrelating experience for 0 frames...
[2024-12-13 03:53:03,298][03232] Decorrelating experience for 0 frames...
[2024-12-13 03:53:03,302][03232] Decorrelating experience for 64 frames...
[2024-12-13 03:53:03,310][03228] Decorrelating experience for 64 frames...
[2024-12-13 03:53:03,296][03231] Decorrelating experience for 0 frames...
[2024-12-13 03:53:03,313][03231] Decorrelating experience for 64 frames...
[2024-12-13 03:53:03,312][03227] Decorrelating experience for 64 frames...
[2024-12-13 03:53:03,627][03234] Decorrelating experience for 0 frames...
[2024-12-13 03:53:03,650][03230] Decorrelating experience for 0 frames...
[2024-12-13 03:53:03,657][03234] Decorrelating experience for 64 frames...
[2024-12-13 03:53:03,659][03233] Decorrelating experience for 0 frames...
[2024-12-13 03:53:03,658][03230] Decorrelating experience for 64 frames...
[2024-12-13 03:53:03,669][03233] Decorrelating experience for 64 frames... [2024-12-13 03:53:03,688][03229] Decorrelating experience for 0 frames... [2024-12-13 03:53:03,701][03229] Decorrelating experience for 64 frames... [2024-12-13 03:53:03,789][03228] Decorrelating experience for 128 frames... [2024-12-13 03:53:03,794][03227] Decorrelating experience for 128 frames... [2024-12-13 03:53:03,809][03232] Decorrelating experience for 128 frames... [2024-12-13 03:53:03,828][03231] Decorrelating experience for 128 frames... [2024-12-13 03:53:04,150][03234] Decorrelating experience for 128 frames... [2024-12-13 03:53:04,159][03230] Decorrelating experience for 128 frames... [2024-12-13 03:53:04,167][03233] Decorrelating experience for 128 frames... [2024-12-13 03:53:04,182][03229] Decorrelating experience for 128 frames... [2024-12-13 03:53:04,523][03228] Decorrelating experience for 192 frames... [2024-12-13 03:53:04,519][03227] Decorrelating experience for 192 frames... [2024-12-13 03:53:04,547][03232] Decorrelating experience for 192 frames... [2024-12-13 03:53:04,554][03231] Decorrelating experience for 192 frames... [2024-12-13 03:53:04,913][03234] Decorrelating experience for 192 frames... [2024-12-13 03:53:04,937][03230] Decorrelating experience for 192 frames... [2024-12-13 03:53:04,959][03233] Decorrelating experience for 192 frames... [2024-12-13 03:53:04,976][03229] Decorrelating experience for 192 frames... [2024-12-13 03:53:05,371][03180] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-12-13 03:53:05,822][03228] Decorrelating experience for 256 frames... [2024-12-13 03:53:05,828][03227] Decorrelating experience for 256 frames... [2024-12-13 03:53:05,833][03232] Decorrelating experience for 256 frames... [2024-12-13 03:53:05,867][03231] Decorrelating experience for 256 frames... 
[2024-12-13 03:53:06,103][03234] Decorrelating experience for 256 frames... [2024-12-13 03:53:06,117][03230] Decorrelating experience for 256 frames... [2024-12-13 03:53:06,154][03233] Decorrelating experience for 256 frames... [2024-12-13 03:53:06,170][03229] Decorrelating experience for 256 frames... [2024-12-13 03:53:07,463][03227] Decorrelating experience for 320 frames... [2024-12-13 03:53:07,489][03228] Decorrelating experience for 320 frames... [2024-12-13 03:53:07,514][03234] Decorrelating experience for 320 frames... [2024-12-13 03:53:07,511][03231] Decorrelating experience for 320 frames... [2024-12-13 03:53:07,533][03230] Decorrelating experience for 320 frames... [2024-12-13 03:53:07,564][03232] Decorrelating experience for 320 frames... [2024-12-13 03:53:07,602][03233] Decorrelating experience for 320 frames... [2024-12-13 03:53:07,609][03229] Decorrelating experience for 320 frames... [2024-12-13 03:53:09,317][03234] Decorrelating experience for 384 frames... [2024-12-13 03:53:09,338][03230] Decorrelating experience for 384 frames... [2024-12-13 03:53:09,354][03227] Decorrelating experience for 384 frames... [2024-12-13 03:53:09,414][03228] Decorrelating experience for 384 frames... [2024-12-13 03:53:09,419][03231] Decorrelating experience for 384 frames... [2024-12-13 03:53:09,419][03229] Decorrelating experience for 384 frames... [2024-12-13 03:53:09,436][03233] Decorrelating experience for 384 frames... [2024-12-13 03:53:09,437][03232] Decorrelating experience for 384 frames... [2024-12-13 03:53:10,371][03180] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-12-13 03:53:10,373][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000000_0.pth... [2024-12-13 03:53:10,831][03227] Decorrelating experience for 448 frames... [2024-12-13 03:53:10,900][03230] Decorrelating experience for 448 frames... 
[2024-12-13 03:53:10,904][03234] Decorrelating experience for 448 frames... [2024-12-13 03:53:10,912][03228] Decorrelating experience for 448 frames... [2024-12-13 03:53:10,950][03231] Decorrelating experience for 448 frames... [2024-12-13 03:53:10,951][03232] Decorrelating experience for 448 frames... [2024-12-13 03:53:11,007][03229] Decorrelating experience for 448 frames... [2024-12-13 03:53:11,041][03233] Decorrelating experience for 448 frames... [2024-12-13 03:53:15,371][03180] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 167.6. Samples: 1676. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-12-13 03:53:15,372][03180] Avg episode reward: [(0, '-69.806')] [2024-12-13 03:53:20,371][03180] Fps is (10 sec: 409.6, 60 sec: 273.1, 300 sec: 273.1). Total num frames: 4096. Throughput: 0: 535.5. Samples: 8032. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 03:53:20,376][03180] Avg episode reward: [(0, '-110.081')] [2024-12-13 03:53:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 614.4, 300 sec: 614.4). Total num frames: 12288. Throughput: 0: 522.0. Samples: 10440. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 03:53:25,371][03180] Avg episode reward: [(0, '-144.185')] [2024-12-13 03:53:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000024_12288.pth... [2024-12-13 03:53:30,373][03180] Fps is (10 sec: 1228.5, 60 sec: 655.3, 300 sec: 655.3). Total num frames: 16384. Throughput: 0: 686.3. Samples: 17160. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 03:53:30,373][03180] Avg episode reward: [(0, '-157.962')] [2024-12-13 03:53:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 24576. Throughput: 0: 824.0. Samples: 24720. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 03:53:35,371][03180] Avg episode reward: [(0, '-139.505')] [2024-12-13 03:53:40,375][03180] Fps is (10 sec: 1228.6, 60 sec: 819.1, 300 sec: 819.1). 
Total num frames: 28672. Throughput: 0: 779.8. Samples: 27296. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 03:53:40,375][03180] Avg episode reward: [(0, '-122.756')] [2024-12-13 03:53:40,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000056_28672.pth... [2024-12-13 03:53:40,396][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000000_0.pth [2024-12-13 03:53:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 32768. Throughput: 0: 846.6. Samples: 33864. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 03:53:45,371][03180] Avg episode reward: [(0, '-143.981')] [2024-12-13 03:53:49,688][03226] Updated weights for policy 0, policy_version 80 (0.0015) [2024-12-13 03:53:50,371][03180] Fps is (10 sec: 1229.3, 60 sec: 910.2, 300 sec: 910.2). Total num frames: 40960. Throughput: 0: 919.9. Samples: 41396. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 03:53:50,371][03180] Avg episode reward: [(0, '-148.077')] [2024-12-13 03:53:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 901.1, 300 sec: 901.1). Total num frames: 45056. Throughput: 0: 982.4. Samples: 44208. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:53:55,371][03180] Avg episode reward: [(0, '-167.034')] [2024-12-13 03:53:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000088_45056.pth... [2024-12-13 03:53:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000024_12288.pth [2024-12-13 03:54:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 893.7, 300 sec: 893.7). Total num frames: 49152. Throughput: 0: 1084.1. Samples: 50460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:54:00,371][03180] Avg episode reward: [(0, '-245.627')] [2024-12-13 03:54:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 955.7, 300 sec: 955.7). Total num frames: 57344. Throughput: 0: 1115.3. Samples: 58220. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:54:05,371][03180] Avg episode reward: [(0, '-331.541')] [2024-12-13 03:54:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1024.0, 300 sec: 945.2). Total num frames: 61440. Throughput: 0: 1131.6. Samples: 61364. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:54:10,371][03180] Avg episode reward: [(0, '-375.758')] [2024-12-13 03:54:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000120_61440.pth... [2024-12-13 03:54:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000056_28672.pth [2024-12-13 03:54:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 936.2). Total num frames: 65536. Throughput: 0: 1114.0. Samples: 67288. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:54:15,371][03180] Avg episode reward: [(0, '-374.028')] [2024-12-13 03:54:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 983.0). Total num frames: 73728. Throughput: 0: 1114.1. Samples: 74856. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 03:54:20,371][03180] Avg episode reward: [(0, '-282.199')] [2024-12-13 03:54:25,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 972.8). Total num frames: 77824. Throughput: 0: 1130.2. Samples: 78152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:54:25,377][03180] Avg episode reward: [(0, '-187.143')] [2024-12-13 03:54:25,389][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000152_77824.pth... [2024-12-13 03:54:25,398][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000088_45056.pth [2024-12-13 03:54:27,513][03226] Updated weights for policy 0, policy_version 160 (0.0010) [2024-12-13 03:54:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 963.8). Total num frames: 81920. Throughput: 0: 1112.4. Samples: 83924. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:54:30,371][03180] Avg episode reward: [(0, '-116.290')] [2024-12-13 03:54:35,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 1001.2). Total num frames: 90112. Throughput: 0: 1114.8. Samples: 91560. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 03:54:35,374][03180] Avg episode reward: [(0, '-141.034')] [2024-12-13 03:54:40,377][03180] Fps is (10 sec: 1228.1, 60 sec: 1092.2, 300 sec: 991.6). Total num frames: 94208. Throughput: 0: 1127.6. Samples: 94956. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:54:40,382][03180] Avg episode reward: [(0, '-149.752')] [2024-12-13 03:54:40,399][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000184_94208.pth... [2024-12-13 03:54:40,412][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000120_61440.pth [2024-12-13 03:54:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 983.0). Total num frames: 98304. Throughput: 0: 1064.3. Samples: 98352. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 03:54:45,371][03180] Avg episode reward: [(0, '-141.962')] [2024-12-13 03:54:50,371][03180] Fps is (10 sec: 819.6, 60 sec: 1024.0, 300 sec: 975.2). Total num frames: 102400. Throughput: 0: 1037.0. Samples: 104884. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:54:50,371][03180] Avg episode reward: [(0, '-159.152')] [2024-12-13 03:54:50,372][03213] Saving new best policy, reward=-159.152! [2024-12-13 03:54:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1005.4). Total num frames: 110592. Throughput: 0: 1052.3. Samples: 108716. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:54:55,374][03180] Avg episode reward: [(0, '-79.687')] [2024-12-13 03:54:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000216_110592.pth... 
[2024-12-13 03:54:55,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000152_77824.pth [2024-12-13 03:54:55,391][03213] Saving new best policy, reward=-79.687! [2024-12-13 03:55:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 997.3). Total num frames: 114688. Throughput: 0: 1051.7. Samples: 114616. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 03:55:00,371][03180] Avg episode reward: [(0, '-76.180')] [2024-12-13 03:55:00,376][03213] Saving new best policy, reward=-76.180! [2024-12-13 03:55:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 989.9). Total num frames: 118784. Throughput: 0: 1032.6. Samples: 121324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:55:05,371][03180] Avg episode reward: [(0, '-149.042')] [2024-12-13 03:55:06,319][03226] Updated weights for policy 0, policy_version 240 (0.0010) [2024-12-13 03:55:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1015.8). Total num frames: 126976. Throughput: 0: 1042.8. Samples: 125076. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 03:55:10,371][03180] Avg episode reward: [(0, '-126.225')] [2024-12-13 03:55:10,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000248_126976.pth... [2024-12-13 03:55:10,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000184_94208.pth [2024-12-13 03:55:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1008.2). Total num frames: 131072. Throughput: 0: 1052.0. Samples: 131264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:55:15,371][03180] Avg episode reward: [(0, '-150.694')] [2024-12-13 03:55:20,372][03180] Fps is (10 sec: 819.1, 60 sec: 1024.0, 300 sec: 1001.2). Total num frames: 135168. Throughput: 0: 1026.2. Samples: 137740. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 03:55:20,373][03180] Avg episode reward: [(0, '-95.753')] [2024-12-13 03:55:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1024.0). Total num frames: 143360. Throughput: 0: 1032.6. Samples: 141416. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:55:25,371][03180] Avg episode reward: [(0, '-89.565')] [2024-12-13 03:55:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000280_143360.pth... [2024-12-13 03:55:25,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000216_110592.pth [2024-12-13 03:55:30,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1016.9). Total num frames: 147456. Throughput: 0: 1101.6. Samples: 147924. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 03:55:30,372][03180] Avg episode reward: [(0, '-139.620')] [2024-12-13 03:55:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1010.3). Total num frames: 151552. Throughput: 0: 1093.8. Samples: 154104. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:55:35,371][03180] Avg episode reward: [(0, '-133.773')] [2024-12-13 03:55:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1030.6). Total num frames: 159744. Throughput: 0: 1094.6. Samples: 157972. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:55:40,371][03180] Avg episode reward: [(0, '-48.551')] [2024-12-13 03:55:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000312_159744.pth... [2024-12-13 03:55:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000248_126976.pth [2024-12-13 03:55:40,383][03213] Saving new best policy, reward=-48.551! [2024-12-13 03:55:42,936][03226] Updated weights for policy 0, policy_version 320 (0.0010) [2024-12-13 03:55:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1024.0). Total num frames: 163840. Throughput: 0: 1114.2. Samples: 164756. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 03:55:45,371][03180] Avg episode reward: [(0, '-50.885')] [2024-12-13 03:55:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1017.8). Total num frames: 167936. Throughput: 0: 1093.3. Samples: 170524. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:55:50,371][03180] Avg episode reward: [(0, '-76.655')] [2024-12-13 03:55:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1036.0). Total num frames: 176128. Throughput: 0: 1095.6. Samples: 174380. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:55:55,371][03180] Avg episode reward: [(0, '-100.051')] [2024-12-13 03:55:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000344_176128.pth... [2024-12-13 03:55:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000280_143360.pth [2024-12-13 03:56:00,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1029.8). Total num frames: 180224. Throughput: 0: 1114.4. Samples: 181416. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:56:00,374][03180] Avg episode reward: [(0, '-79.512')] [2024-12-13 03:56:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1024.0). Total num frames: 184320. Throughput: 0: 1098.2. Samples: 187156. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:56:05,371][03180] Avg episode reward: [(0, '-55.475')] [2024-12-13 03:56:10,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1040.6). Total num frames: 192512. Throughput: 0: 1099.9. Samples: 190912. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 03:56:10,371][03180] Avg episode reward: [(0, '-85.871')] [2024-12-13 03:56:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000376_192512.pth... [2024-12-13 03:56:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000312_159744.pth [2024-12-13 03:56:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1034.8). 
Total num frames: 196608. Throughput: 0: 1118.1. Samples: 198240. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:56:15,371][03180] Avg episode reward: [(0, '-82.605')] [2024-12-13 03:56:20,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.3, 300 sec: 1029.2). Total num frames: 200704. Throughput: 0: 1097.4. Samples: 203488. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 03:56:20,373][03180] Avg episode reward: [(0, '-62.253')] [2024-12-13 03:56:21,137][03226] Updated weights for policy 0, policy_version 400 (0.0010) [2024-12-13 03:56:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1044.5). Total num frames: 208896. Throughput: 0: 1096.2. Samples: 207300. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 03:56:25,371][03180] Avg episode reward: [(0, '-112.306')] [2024-12-13 03:56:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000408_208896.pth... [2024-12-13 03:56:25,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000344_176128.pth [2024-12-13 03:56:30,373][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.2, 300 sec: 1039.0). Total num frames: 212992. Throughput: 0: 1117.2. Samples: 215032. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:56:30,374][03180] Avg episode reward: [(0, '-64.900')] [2024-12-13 03:56:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1033.8). Total num frames: 217088. Throughput: 0: 1109.7. Samples: 220460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:56:35,371][03180] Avg episode reward: [(0, '-61.917')] [2024-12-13 03:56:40,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1047.8). Total num frames: 225280. Throughput: 0: 1105.4. Samples: 224124. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:56:40,371][03180] Avg episode reward: [(0, '-18.565')] [2024-12-13 03:56:40,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000440_225280.pth... 
[2024-12-13 03:56:40,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000376_192512.pth [2024-12-13 03:56:40,391][03213] Saving new best policy, reward=-18.565! [2024-12-13 03:56:45,372][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 1042.6). Total num frames: 229376. Throughput: 0: 1064.4. Samples: 229312. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 03:56:45,372][03180] Avg episode reward: [(0, '-30.322')] [2024-12-13 03:56:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1037.7). Total num frames: 233472. Throughput: 0: 1055.2. Samples: 234640. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:56:50,372][03180] Avg episode reward: [(0, '-31.685')] [2024-12-13 03:56:55,371][03180] Fps is (10 sec: 819.3, 60 sec: 1024.0, 300 sec: 1032.9). Total num frames: 237568. Throughput: 0: 1047.1. Samples: 238032. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 03:56:55,371][03180] Avg episode reward: [(0, '-23.643')] [2024-12-13 03:56:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000464_237568.pth... [2024-12-13 03:56:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000408_208896.pth [2024-12-13 03:56:59,072][03226] Updated weights for policy 0, policy_version 480 (0.0012) [2024-12-13 03:57:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1045.8). Total num frames: 245760. Throughput: 0: 1056.0. Samples: 245760. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:57:00,372][03180] Avg episode reward: [(0, '-19.890')] [2024-12-13 03:57:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1041.1). Total num frames: 249856. Throughput: 0: 1066.3. Samples: 251468. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 03:57:05,371][03180] Avg episode reward: [(0, '-50.292')] [2024-12-13 03:57:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1036.5). Total num frames: 253952. Throughput: 0: 1052.2. Samples: 254648. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 03:57:10,371][03180] Avg episode reward: [(0, '-74.175')] [2024-12-13 03:57:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000496_253952.pth... [2024-12-13 03:57:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000440_225280.pth [2024-12-13 03:57:15,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1092.2, 300 sec: 1048.6). Total num frames: 262144. Throughput: 0: 1048.9. Samples: 262232. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:57:15,375][03180] Avg episode reward: [(0, '-61.682')] [2024-12-13 03:57:20,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.3, 300 sec: 1044.1). Total num frames: 266240. Throughput: 0: 1067.4. Samples: 268496. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:57:20,373][03180] Avg episode reward: [(0, '-49.419')] [2024-12-13 03:57:25,371][03180] Fps is (10 sec: 819.5, 60 sec: 1024.0, 300 sec: 1039.8). Total num frames: 270336. Throughput: 0: 1048.9. Samples: 271324. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 03:57:25,371][03180] Avg episode reward: [(0, '-48.302')] [2024-12-13 03:57:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000528_270336.pth... [2024-12-13 03:57:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000464_237568.pth [2024-12-13 03:57:30,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1051.0). Total num frames: 278528. Throughput: 0: 1101.8. Samples: 278892. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:57:30,371][03180] Avg episode reward: [(0, '-47.744')] [2024-12-13 03:57:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1046.8). Total num frames: 282624. Throughput: 0: 1131.6. Samples: 285564. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 03:57:35,372][03180] Avg episode reward: [(0, '-47.237')] [2024-12-13 03:57:36,597][03226] Updated weights for policy 0, policy_version 560 (0.0009) [2024-12-13 03:57:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1057.5). Total num frames: 290816. Throughput: 0: 1111.6. Samples: 288052. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 03:57:40,371][03180] Avg episode reward: [(0, '-50.469')] [2024-12-13 03:57:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000568_290816.pth... [2024-12-13 03:57:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000496_253952.pth [2024-12-13 03:57:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1053.3). Total num frames: 294912. Throughput: 0: 1112.4. Samples: 295820. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:57:45,371][03180] Avg episode reward: [(0, '-43.419')] [2024-12-13 03:57:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1049.1). Total num frames: 299008. Throughput: 0: 1135.4. Samples: 302560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:57:50,372][03180] Avg episode reward: [(0, '-32.969')] [2024-12-13 03:57:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1059.3). Total num frames: 307200. Throughput: 0: 1117.9. Samples: 304952. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:57:55,371][03180] Avg episode reward: [(0, '-23.013')] [2024-12-13 03:57:55,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000600_307200.pth... [2024-12-13 03:57:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000528_270336.pth [2024-12-13 03:58:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1055.2). Total num frames: 311296. Throughput: 0: 1110.1. Samples: 312184. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:58:00,371][03180] Avg episode reward: [(0, '-16.761')] [2024-12-13 03:58:00,372][03213] Saving new best policy, reward=-16.761! [2024-12-13 03:58:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1069.1). Total num frames: 315392. Throughput: 0: 1124.3. Samples: 319088. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 03:58:05,371][03180] Avg episode reward: [(0, '-8.702')] [2024-12-13 03:58:05,372][03213] Saving new best policy, reward=-8.702! [2024-12-13 03:58:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 323584. Throughput: 0: 1114.1. Samples: 321460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:58:10,371][03180] Avg episode reward: [(0, '-2.656')] [2024-12-13 03:58:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000632_323584.pth... [2024-12-13 03:58:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000568_290816.pth [2024-12-13 03:58:10,392][03213] Saving new best policy, reward=-2.656! [2024-12-13 03:58:13,621][03226] Updated weights for policy 0, policy_version 640 (0.0011) [2024-12-13 03:58:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 327680. Throughput: 0: 1095.2. Samples: 328176. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:58:15,372][03180] Avg episode reward: [(0, '2.420')] [2024-12-13 03:58:15,372][03213] Saving new best policy, reward=2.420! [2024-12-13 03:58:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 331776. Throughput: 0: 1111.2. Samples: 335568. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 03:58:20,371][03180] Avg episode reward: [(0, '10.214')] [2024-12-13 03:58:20,373][03213] Saving new best policy, reward=10.214! [2024-12-13 03:58:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 335872. Throughput: 0: 1110.7. 
Samples: 338032. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 03:58:25,372][03180] Avg episode reward: [(0, '15.123')] [2024-12-13 03:58:25,402][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000664_339968.pth... [2024-12-13 03:58:25,407][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000600_307200.pth [2024-12-13 03:58:25,408][03213] Saving new best policy, reward=15.123! [2024-12-13 03:58:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 344064. Throughput: 0: 1083.2. Samples: 344564. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:58:30,371][03180] Avg episode reward: [(0, '28.568')] [2024-12-13 03:58:30,372][03213] Saving new best policy, reward=28.568! [2024-12-13 03:58:35,371][03180] Fps is (10 sec: 1638.4, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 352256. Throughput: 0: 1104.4. Samples: 352256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:58:35,371][03180] Avg episode reward: [(0, '41.459')] [2024-12-13 03:58:35,372][03213] Saving new best policy, reward=41.459! [2024-12-13 03:58:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 356352. Throughput: 0: 1109.2. Samples: 354868. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 03:58:40,371][03180] Avg episode reward: [(0, '47.287')] [2024-12-13 03:58:40,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000696_356352.pth... [2024-12-13 03:58:40,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000632_323584.pth [2024-12-13 03:58:40,394][03213] Saving new best policy, reward=47.287! [2024-12-13 03:58:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 360448. Throughput: 0: 1088.2. Samples: 361152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:58:45,371][03180] Avg episode reward: [(0, '56.537')] [2024-12-13 03:58:45,372][03213] Saving new best policy, reward=56.537! 
[2024-12-13 03:58:50,013][03226] Updated weights for policy 0, policy_version 720 (0.0010) [2024-12-13 03:58:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 368640. Throughput: 0: 1102.3. Samples: 368692. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:58:50,371][03180] Avg episode reward: [(0, '76.562')] [2024-12-13 03:58:50,372][03213] Saving new best policy, reward=76.562! [2024-12-13 03:58:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 372736. Throughput: 0: 1115.4. Samples: 371652. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:58:55,371][03180] Avg episode reward: [(0, '86.884')] [2024-12-13 03:58:55,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000728_372736.pth... [2024-12-13 03:58:55,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000664_339968.pth [2024-12-13 03:58:55,388][03213] Saving new best policy, reward=86.884! [2024-12-13 03:59:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 376832. Throughput: 0: 1099.2. Samples: 377640. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 03:59:00,371][03180] Avg episode reward: [(0, '102.489')] [2024-12-13 03:59:00,372][03213] Saving new best policy, reward=102.489! [2024-12-13 03:59:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 385024. Throughput: 0: 1104.8. Samples: 385284. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:59:05,371][03180] Avg episode reward: [(0, '119.850')] [2024-12-13 03:59:05,372][03213] Saving new best policy, reward=119.850! [2024-12-13 03:59:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 389120. Throughput: 0: 1121.1. Samples: 388480. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:59:10,371][03180] Avg episode reward: [(0, '128.130')] [2024-12-13 03:59:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000760_389120.pth... [2024-12-13 03:59:10,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000696_356352.pth [2024-12-13 03:59:10,392][03213] Saving new best policy, reward=128.130! [2024-12-13 03:59:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 393216. Throughput: 0: 1101.2. Samples: 394116. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 03:59:15,372][03180] Avg episode reward: [(0, '137.054')] [2024-12-13 03:59:15,372][03213] Saving new best policy, reward=137.054! [2024-12-13 03:59:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 401408. Throughput: 0: 1093.8. Samples: 401476. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 03:59:20,371][03180] Avg episode reward: [(0, '155.669')] [2024-12-13 03:59:20,372][03213] Saving new best policy, reward=155.669! [2024-12-13 03:59:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 405504. Throughput: 0: 1114.6. Samples: 405024. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:59:25,371][03180] Avg episode reward: [(0, '164.996')] [2024-12-13 03:59:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000792_405504.pth... [2024-12-13 03:59:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000728_372736.pth [2024-12-13 03:59:25,383][03213] Saving new best policy, reward=164.996! [2024-12-13 03:59:28,431][03226] Updated weights for policy 0, policy_version 800 (0.0018) [2024-12-13 03:59:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 409600. Throughput: 0: 1093.2. Samples: 410348. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:59:30,371][03180] Avg episode reward: [(0, '177.565')] [2024-12-13 03:59:30,372][03213] Saving new best policy, reward=177.565! [2024-12-13 03:59:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 417792. Throughput: 0: 1092.5. Samples: 417856. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:59:35,371][03180] Avg episode reward: [(0, '190.615')] [2024-12-13 03:59:35,372][03213] Saving new best policy, reward=190.615! [2024-12-13 03:59:40,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 421888. Throughput: 0: 1111.7. Samples: 421684. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:59:40,375][03180] Avg episode reward: [(0, '202.484')] [2024-12-13 03:59:40,387][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000824_421888.pth... [2024-12-13 03:59:40,394][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000760_389120.pth [2024-12-13 03:59:40,395][03213] Saving new best policy, reward=202.484! [2024-12-13 03:59:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 425984. Throughput: 0: 1088.7. Samples: 426632. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 03:59:45,371][03180] Avg episode reward: [(0, '217.770')] [2024-12-13 03:59:45,372][03213] Saving new best policy, reward=217.770! [2024-12-13 03:59:50,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 434176. Throughput: 0: 1085.5. Samples: 434132. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 03:59:50,371][03180] Avg episode reward: [(0, '240.537')] [2024-12-13 03:59:50,372][03213] Saving new best policy, reward=240.537! [2024-12-13 03:59:55,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 438272. Throughput: 0: 1098.6. Samples: 437920. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 03:59:55,373][03180] Avg episode reward: [(0, '264.524')] [2024-12-13 03:59:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000856_438272.pth... [2024-12-13 03:59:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000792_405504.pth [2024-12-13 03:59:55,387][03213] Saving new best policy, reward=264.524! [2024-12-13 04:00:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 442368. Throughput: 0: 1080.3. Samples: 442728. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:00:00,371][03180] Avg episode reward: [(0, '291.553')] [2024-12-13 04:00:00,373][03213] Saving new best policy, reward=291.553! [2024-12-13 04:00:05,372][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 446464. Throughput: 0: 1079.9. Samples: 450072. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:00:05,372][03180] Avg episode reward: [(0, '318.815')] [2024-12-13 04:00:05,373][03213] Saving new best policy, reward=318.815! [2024-12-13 04:00:05,509][03226] Updated weights for policy 0, policy_version 880 (0.0013) [2024-12-13 04:00:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 454656. Throughput: 0: 1082.1. Samples: 453720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:00:10,372][03180] Avg episode reward: [(0, '328.627')] [2024-12-13 04:00:10,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000888_454656.pth... [2024-12-13 04:00:10,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000824_421888.pth [2024-12-13 04:00:10,391][03213] Saving new best policy, reward=328.627! [2024-12-13 04:00:15,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 458752. Throughput: 0: 1079.2. Samples: 458912. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:00:15,372][03180] Avg episode reward: [(0, '340.944')] [2024-12-13 04:00:15,373][03213] Saving new best policy, reward=340.944! [2024-12-13 04:00:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 462848. Throughput: 0: 1064.6. Samples: 465764. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:00:20,371][03180] Avg episode reward: [(0, '376.043')] [2024-12-13 04:00:20,374][03213] Saving new best policy, reward=376.043! [2024-12-13 04:00:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 466944. Throughput: 0: 1032.3. Samples: 468136. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:00:25,371][03180] Avg episode reward: [(0, '379.804')] [2024-12-13 04:00:25,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000912_466944.pth... [2024-12-13 04:00:25,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000856_438272.pth [2024-12-13 04:00:25,396][03213] Saving new best policy, reward=379.804! [2024-12-13 04:00:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 471040. Throughput: 0: 1024.3. Samples: 472724. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:00:30,371][03180] Avg episode reward: [(0, '389.603')] [2024-12-13 04:00:30,372][03213] Saving new best policy, reward=389.603! [2024-12-13 04:00:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 479232. Throughput: 0: 1001.1. Samples: 479180. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:00:35,371][03180] Avg episode reward: [(0, '406.995')] [2024-12-13 04:00:35,372][03213] Saving new best policy, reward=406.995! [2024-12-13 04:00:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1024.1, 300 sec: 1083.0). Total num frames: 483328. Throughput: 0: 1001.6. Samples: 482992. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:00:40,374][03180] Avg episode reward: [(0, '428.491')] [2024-12-13 04:00:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000944_483328.pth... [2024-12-13 04:00:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000888_454656.pth [2024-12-13 04:00:40,386][03213] Saving new best policy, reward=428.491! [2024-12-13 04:00:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 487424. Throughput: 0: 1033.4. Samples: 489232. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:00:45,371][03180] Avg episode reward: [(0, '437.822')] [2024-12-13 04:00:45,375][03213] Saving new best policy, reward=437.822! [2024-12-13 04:00:46,841][03226] Updated weights for policy 0, policy_version 960 (0.0011) [2024-12-13 04:00:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 495616. Throughput: 0: 1007.7. Samples: 495416. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:00:50,371][03180] Avg episode reward: [(0, '454.640')] [2024-12-13 04:00:50,372][03213] Saving new best policy, reward=454.640! [2024-12-13 04:00:55,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 499712. Throughput: 0: 1009.9. Samples: 499168. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:00:55,374][03180] Avg episode reward: [(0, '469.741')] [2024-12-13 04:00:55,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000000976_499712.pth... [2024-12-13 04:00:55,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000912_466944.pth [2024-12-13 04:00:55,394][03213] Saving new best policy, reward=469.741! [2024-12-13 04:01:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 503808. Throughput: 0: 1037.9. Samples: 505616. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:01:00,371][03180] Avg episode reward: [(0, '482.864')] [2024-12-13 04:01:00,372][03213] Saving new best policy, reward=482.864! [2024-12-13 04:01:05,372][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 512000. Throughput: 0: 1024.1. Samples: 511852. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:01:05,373][03180] Avg episode reward: [(0, '492.323')] [2024-12-13 04:01:05,374][03213] Saving new best policy, reward=492.323! [2024-12-13 04:01:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 516096. Throughput: 0: 1057.8. Samples: 515736. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:01:10,371][03180] Avg episode reward: [(0, '496.524')] [2024-12-13 04:01:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001008_516096.pth... [2024-12-13 04:01:10,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000944_483328.pth [2024-12-13 04:01:10,382][03213] Saving new best policy, reward=496.524! [2024-12-13 04:01:15,371][03180] Fps is (10 sec: 819.3, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 520192. Throughput: 0: 1109.6. Samples: 522656. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:01:15,371][03180] Avg episode reward: [(0, '519.512')] [2024-12-13 04:01:15,372][03213] Saving new best policy, reward=519.512! [2024-12-13 04:01:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 528384. Throughput: 0: 1095.4. Samples: 528472. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:01:20,371][03180] Avg episode reward: [(0, '533.405')] [2024-12-13 04:01:20,372][03213] Saving new best policy, reward=533.405! [2024-12-13 04:01:23,129][03226] Updated weights for policy 0, policy_version 1040 (0.0015) [2024-12-13 04:01:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 532480. 
Throughput: 0: 1098.9. Samples: 532444. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:01:25,372][03180] Avg episode reward: [(0, '547.408')] [2024-12-13 04:01:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001040_532480.pth... [2024-12-13 04:01:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000000976_499712.pth [2024-12-13 04:01:25,384][03213] Saving new best policy, reward=547.408! [2024-12-13 04:01:30,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1083.0). Total num frames: 536576. Throughput: 0: 1118.7. Samples: 539576. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:01:30,374][03180] Avg episode reward: [(0, '564.850')] [2024-12-13 04:01:30,375][03213] Saving new best policy, reward=564.850! [2024-12-13 04:01:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 544768. Throughput: 0: 1105.4. Samples: 545160. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:01:35,371][03180] Avg episode reward: [(0, '568.886')] [2024-12-13 04:01:35,372][03213] Saving new best policy, reward=568.886! [2024-12-13 04:01:40,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 548864. Throughput: 0: 1105.6. Samples: 548920. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:01:40,371][03180] Avg episode reward: [(0, '598.525')] [2024-12-13 04:01:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001072_548864.pth... [2024-12-13 04:01:40,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001008_516096.pth [2024-12-13 04:01:40,392][03213] Saving new best policy, reward=598.525! [2024-12-13 04:01:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 557056. Throughput: 0: 1126.9. Samples: 556328. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:01:45,371][03180] Avg episode reward: [(0, '620.235')] [2024-12-13 04:01:45,372][03213] Saving new best policy, reward=620.235! [2024-12-13 04:01:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 561152. Throughput: 0: 1104.2. Samples: 561540. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:01:50,371][03180] Avg episode reward: [(0, '634.184')] [2024-12-13 04:01:50,372][03213] Saving new best policy, reward=634.184! [2024-12-13 04:01:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 565248. Throughput: 0: 1101.8. Samples: 565316. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:01:55,371][03180] Avg episode reward: [(0, '633.590')] [2024-12-13 04:01:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001104_565248.pth... [2024-12-13 04:01:55,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001040_532480.pth [2024-12-13 04:01:59,400][03226] Updated weights for policy 0, policy_version 1120 (0.0012) [2024-12-13 04:02:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 573440. Throughput: 0: 1122.2. Samples: 573156. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:02:00,371][03180] Avg episode reward: [(0, '655.412')] [2024-12-13 04:02:00,372][03213] Saving new best policy, reward=655.412! [2024-12-13 04:02:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 577536. Throughput: 0: 1105.0. Samples: 578196. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:02:05,371][03180] Avg episode reward: [(0, '663.205')] [2024-12-13 04:02:05,372][03213] Saving new best policy, reward=663.205! [2024-12-13 04:02:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 581632. Throughput: 0: 1100.7. Samples: 581976. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:02:10,371][03180] Avg episode reward: [(0, '673.053')] [2024-12-13 04:02:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001136_581632.pth... [2024-12-13 04:02:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001072_548864.pth [2024-12-13 04:02:10,384][03213] Saving new best policy, reward=673.053! [2024-12-13 04:02:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 589824. Throughput: 0: 1115.3. Samples: 589764. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:02:15,371][03180] Avg episode reward: [(0, '682.600')] [2024-12-13 04:02:15,372][03213] Saving new best policy, reward=682.600! [2024-12-13 04:02:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 593920. Throughput: 0: 1112.6. Samples: 595228. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:02:20,372][03180] Avg episode reward: [(0, '705.328')] [2024-12-13 04:02:20,373][03213] Saving new best policy, reward=705.328! [2024-12-13 04:02:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 598016. Throughput: 0: 1103.7. Samples: 598588. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:02:25,371][03180] Avg episode reward: [(0, '714.066')] [2024-12-13 04:02:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001168_598016.pth... [2024-12-13 04:02:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001104_565248.pth [2024-12-13 04:02:25,385][03213] Saving new best policy, reward=714.066! [2024-12-13 04:02:30,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.6, 300 sec: 1096.9). Total num frames: 606208. Throughput: 0: 1109.7. Samples: 606264. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:02:30,371][03180] Avg episode reward: [(0, '725.860')] [2024-12-13 04:02:30,372][03213] Saving new best policy, reward=725.860! 
[2024-12-13 04:02:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 610304. Throughput: 0: 1125.0. Samples: 612164. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:02:35,371][03180] Avg episode reward: [(0, '741.574')] [2024-12-13 04:02:35,372][03213] Saving new best policy, reward=741.574! [2024-12-13 04:02:37,385][03226] Updated weights for policy 0, policy_version 1200 (0.0013) [2024-12-13 04:02:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 614400. Throughput: 0: 1109.2. Samples: 615228. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:02:40,371][03180] Avg episode reward: [(0, '754.434')] [2024-12-13 04:02:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001200_614400.pth... [2024-12-13 04:02:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001136_581632.pth [2024-12-13 04:02:40,385][03213] Saving new best policy, reward=754.434! [2024-12-13 04:02:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 622592. Throughput: 0: 1104.8. Samples: 622872. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:02:45,371][03180] Avg episode reward: [(0, '758.122')] [2024-12-13 04:02:45,372][03213] Saving new best policy, reward=758.122! [2024-12-13 04:02:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 626688. Throughput: 0: 1134.6. Samples: 629252. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:02:50,371][03180] Avg episode reward: [(0, '756.263')] [2024-12-13 04:02:55,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 634880. Throughput: 0: 1109.3. Samples: 631896. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:02:55,373][03180] Avg episode reward: [(0, '744.448')] [2024-12-13 04:02:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001240_634880.pth... 
[2024-12-13 04:02:55,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001168_598016.pth [2024-12-13 04:03:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 638976. Throughput: 0: 1106.2. Samples: 639544. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:03:00,371][03180] Avg episode reward: [(0, '761.852')] [2024-12-13 04:03:00,372][03213] Saving new best policy, reward=761.852! [2024-12-13 04:03:05,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 643072. Throughput: 0: 1135.5. Samples: 646324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:03:05,372][03180] Avg episode reward: [(0, '756.296')] [2024-12-13 04:03:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 651264. Throughput: 0: 1113.6. Samples: 648700. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:03:10,371][03180] Avg episode reward: [(0, '767.037')] [2024-12-13 04:03:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001272_651264.pth... [2024-12-13 04:03:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001200_614400.pth [2024-12-13 04:03:10,384][03213] Saving new best policy, reward=767.037! [2024-12-13 04:03:13,451][03226] Updated weights for policy 0, policy_version 1280 (0.0020) [2024-12-13 04:03:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 655360. Throughput: 0: 1107.5. Samples: 656100. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:03:15,371][03180] Avg episode reward: [(0, '808.316')] [2024-12-13 04:03:15,372][03213] Saving new best policy, reward=808.316! [2024-12-13 04:03:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 659456. Throughput: 0: 1133.1. Samples: 663152. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:03:20,371][03180] Avg episode reward: [(0, '827.402')] [2024-12-13 04:03:20,372][03213] Saving new best policy, reward=827.402! [2024-12-13 04:03:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 667648. Throughput: 0: 1118.4. Samples: 665556. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:03:25,371][03180] Avg episode reward: [(0, '820.118')] [2024-12-13 04:03:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001304_667648.pth... [2024-12-13 04:03:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001240_634880.pth [2024-12-13 04:03:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 671744. Throughput: 0: 1103.8. Samples: 672544. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:03:30,371][03180] Avg episode reward: [(0, '831.313')] [2024-12-13 04:03:30,372][03213] Saving new best policy, reward=831.313! [2024-12-13 04:03:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 675840. Throughput: 0: 1122.4. Samples: 679760. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:03:35,371][03180] Avg episode reward: [(0, '850.235')] [2024-12-13 04:03:35,401][03213] Saving new best policy, reward=850.235! [2024-12-13 04:03:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 684032. Throughput: 0: 1118.1. Samples: 682208. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:03:40,371][03180] Avg episode reward: [(0, '869.806')] [2024-12-13 04:03:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001336_684032.pth... [2024-12-13 04:03:40,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001272_651264.pth [2024-12-13 04:03:40,382][03213] Saving new best policy, reward=869.806! 
[2024-12-13 04:03:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 688128. Throughput: 0: 1098.8. Samples: 688988. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:03:45,371][03180] Avg episode reward: [(0, '887.232')] [2024-12-13 04:03:45,372][03213] Saving new best policy, reward=887.232! [2024-12-13 04:03:49,956][03226] Updated weights for policy 0, policy_version 1360 (0.0012) [2024-12-13 04:03:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 696320. Throughput: 0: 1112.9. Samples: 696404. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:03:50,371][03180] Avg episode reward: [(0, '894.604')] [2024-12-13 04:03:50,372][03213] Saving new best policy, reward=894.604! [2024-12-13 04:03:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 700416. Throughput: 0: 1116.9. Samples: 698960. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:03:55,371][03180] Avg episode reward: [(0, '892.023')] [2024-12-13 04:03:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001368_700416.pth... [2024-12-13 04:03:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001304_667648.pth [2024-12-13 04:04:00,377][03180] Fps is (10 sec: 818.7, 60 sec: 1092.1, 300 sec: 1083.0). Total num frames: 704512. Throughput: 0: 1097.6. Samples: 705500. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:04:00,389][03180] Avg episode reward: [(0, '905.081')] [2024-12-13 04:04:00,390][03213] Saving new best policy, reward=905.081! [2024-12-13 04:04:05,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1083.0). Total num frames: 708608. Throughput: 0: 1062.1. Samples: 710948. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:04:05,372][03180] Avg episode reward: [(0, '926.723')] [2024-12-13 04:04:05,375][03213] Saving new best policy, reward=926.723! 
[2024-12-13 04:04:10,371][03180] Fps is (10 sec: 819.7, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 712704. Throughput: 0: 1059.6. Samples: 713240. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:04:10,371][03180] Avg episode reward: [(0, '936.753')] [2024-12-13 04:04:10,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001392_712704.pth... [2024-12-13 04:04:10,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001336_684032.pth [2024-12-13 04:04:10,395][03213] Saving new best policy, reward=936.753! [2024-12-13 04:04:15,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 720896. Throughput: 0: 1044.7. Samples: 719556. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:04:15,371][03180] Avg episode reward: [(0, '960.333')] [2024-12-13 04:04:15,372][03213] Saving new best policy, reward=960.333! [2024-12-13 04:04:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 724992. Throughput: 0: 1053.1. Samples: 727148. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:04:20,371][03180] Avg episode reward: [(0, '983.666')] [2024-12-13 04:04:20,372][03213] Saving new best policy, reward=983.666! [2024-12-13 04:04:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 729088. Throughput: 0: 1061.2. Samples: 729960. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:04:25,371][03180] Avg episode reward: [(0, '977.273')] [2024-12-13 04:04:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001424_729088.pth... [2024-12-13 04:04:25,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001368_700416.pth [2024-12-13 04:04:30,082][03226] Updated weights for policy 0, policy_version 1440 (0.0013) [2024-12-13 04:04:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 737280. Throughput: 0: 1044.1. Samples: 735972. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:04:30,371][03180] Avg episode reward: [(0, '949.679')] [2024-12-13 04:04:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 741376. Throughput: 0: 1046.2. Samples: 743484. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:04:35,372][03180] Avg episode reward: [(0, '939.367')] [2024-12-13 04:04:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 745472. Throughput: 0: 1061.2. Samples: 746716. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:04:40,371][03180] Avg episode reward: [(0, '944.362')] [2024-12-13 04:04:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001456_745472.pth... [2024-12-13 04:04:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001392_712704.pth [2024-12-13 04:04:45,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 753664. Throughput: 0: 1042.7. Samples: 752416. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:04:45,373][03180] Avg episode reward: [(0, '945.458')] [2024-12-13 04:04:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 757760. Throughput: 0: 1081.4. Samples: 759608. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:04:50,372][03180] Avg episode reward: [(0, '954.576')] [2024-12-13 04:04:55,374][03180] Fps is (10 sec: 818.9, 60 sec: 1023.9, 300 sec: 1083.0). Total num frames: 761856. Throughput: 0: 1090.6. Samples: 762320. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:04:55,375][03180] Avg episode reward: [(0, '947.485')] [2024-12-13 04:04:55,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001488_761856.pth... [2024-12-13 04:04:55,409][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001424_729088.pth [2024-12-13 04:05:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.1, 300 sec: 1083.0). 
Total num frames: 765952. Throughput: 0: 1036.4. Samples: 766192. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:05:00,371][03180] Avg episode reward: [(0, '953.870')] [2024-12-13 04:05:05,371][03180] Fps is (10 sec: 819.5, 60 sec: 1024.0, 300 sec: 1069.1). Total num frames: 770048. Throughput: 0: 1035.6. Samples: 773752. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:05:05,371][03180] Avg episode reward: [(0, '958.970')] [2024-12-13 04:05:08,764][03226] Updated weights for policy 0, policy_version 1520 (0.0013) [2024-12-13 04:05:10,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1092.2, 300 sec: 1083.0). Total num frames: 778240. Throughput: 0: 1056.0. Samples: 777484. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:05:10,375][03180] Avg episode reward: [(0, '962.392')] [2024-12-13 04:05:10,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001520_778240.pth... [2024-12-13 04:05:10,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001456_745472.pth [2024-12-13 04:05:15,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 782336. Throughput: 0: 1038.3. Samples: 782700. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:05:15,373][03180] Avg episode reward: [(0, '956.609')] [2024-12-13 04:05:20,371][03180] Fps is (10 sec: 819.5, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 786432. Throughput: 0: 1032.1. Samples: 789928. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:05:20,371][03180] Avg episode reward: [(0, '965.618')] [2024-12-13 04:05:25,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 794624. Throughput: 0: 1040.4. Samples: 793532. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:05:25,371][03180] Avg episode reward: [(0, '962.561')] [2024-12-13 04:05:25,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001552_794624.pth... 
[2024-12-13 04:05:25,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001488_761856.pth [2024-12-13 04:05:30,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 798720. Throughput: 0: 1039.8. Samples: 799208. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:05:30,373][03180] Avg episode reward: [(0, '958.487')] [2024-12-13 04:05:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 802816. Throughput: 0: 1036.2. Samples: 806236. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:05:35,371][03180] Avg episode reward: [(0, '981.383')] [2024-12-13 04:05:40,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 811008. Throughput: 0: 1058.4. Samples: 809944. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:05:40,371][03180] Avg episode reward: [(0, '1008.010')] [2024-12-13 04:05:40,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001584_811008.pth... [2024-12-13 04:05:40,380][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001520_778240.pth [2024-12-13 04:05:40,384][03213] Saving new best policy, reward=1008.010! [2024-12-13 04:05:45,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 815104. Throughput: 0: 1108.6. Samples: 816084. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:05:45,375][03180] Avg episode reward: [(0, '998.350')] [2024-12-13 04:05:47,348][03226] Updated weights for policy 0, policy_version 1600 (0.0011) [2024-12-13 04:05:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 819200. Throughput: 0: 1087.3. Samples: 822680. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:05:50,371][03180] Avg episode reward: [(0, '1023.319')] [2024-12-13 04:05:50,372][03213] Saving new best policy, reward=1023.319! 
[2024-12-13 04:05:55,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 827392. Throughput: 0: 1087.3. Samples: 826408. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:05:55,371][03180] Avg episode reward: [(0, '1045.584')] [2024-12-13 04:05:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001616_827392.pth... [2024-12-13 04:05:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001552_794624.pth [2024-12-13 04:05:55,384][03213] Saving new best policy, reward=1045.584! [2024-12-13 04:06:00,376][03180] Fps is (10 sec: 1228.1, 60 sec: 1092.2, 300 sec: 1083.0). Total num frames: 831488. Throughput: 0: 1112.0. Samples: 832744. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:06:00,377][03180] Avg episode reward: [(0, '1070.880')] [2024-12-13 04:06:00,378][03213] Saving new best policy, reward=1070.880! [2024-12-13 04:06:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 835584. Throughput: 0: 1089.4. Samples: 838952. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:06:05,371][03180] Avg episode reward: [(0, '1070.781')] [2024-12-13 04:06:10,371][03180] Fps is (10 sec: 1229.5, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 843776. Throughput: 0: 1093.9. Samples: 842756. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:06:10,371][03180] Avg episode reward: [(0, '1096.100')] [2024-12-13 04:06:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001648_843776.pth... [2024-12-13 04:06:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001584_811008.pth [2024-12-13 04:06:10,385][03213] Saving new best policy, reward=1096.100! [2024-12-13 04:06:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 847872. Throughput: 0: 1120.8. Samples: 849644. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:06:15,371][03180] Avg episode reward: [(0, '1092.061')] [2024-12-13 04:06:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 851968. Throughput: 0: 1096.7. Samples: 855588. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:06:20,371][03180] Avg episode reward: [(0, '1098.273')] [2024-12-13 04:06:20,372][03213] Saving new best policy, reward=1098.273! [2024-12-13 04:06:23,798][03226] Updated weights for policy 0, policy_version 1680 (0.0011) [2024-12-13 04:06:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 860160. Throughput: 0: 1097.0. Samples: 859308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:06:25,371][03180] Avg episode reward: [(0, '1145.974')] [2024-12-13 04:06:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001680_860160.pth... [2024-12-13 04:06:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001616_827392.pth [2024-12-13 04:06:25,390][03213] Saving new best policy, reward=1145.974! [2024-12-13 04:06:30,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1083.0). Total num frames: 864256. Throughput: 0: 1116.6. Samples: 866332. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:06:30,376][03180] Avg episode reward: [(0, '1160.750')] [2024-12-13 04:06:30,377][03213] Saving new best policy, reward=1160.750! [2024-12-13 04:06:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 868352. Throughput: 0: 1095.3. Samples: 871968. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:06:35,372][03180] Avg episode reward: [(0, '1190.044')] [2024-12-13 04:06:35,373][03213] Saving new best policy, reward=1190.044! [2024-12-13 04:06:40,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 876544. Throughput: 0: 1096.4. Samples: 875748. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:06:40,371][03180] Avg episode reward: [(0, '1222.429')] [2024-12-13 04:06:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001712_876544.pth... [2024-12-13 04:06:40,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001648_843776.pth [2024-12-13 04:06:40,387][03213] Saving new best policy, reward=1222.429! [2024-12-13 04:06:45,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 880640. Throughput: 0: 1119.8. Samples: 883132. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:06:45,378][03180] Avg episode reward: [(0, '1240.997')] [2024-12-13 04:06:45,379][03213] Saving new best policy, reward=1240.997! [2024-12-13 04:06:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 884736. Throughput: 0: 1100.5. Samples: 888476. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:06:50,372][03180] Avg episode reward: [(0, '1247.512')] [2024-12-13 04:06:50,397][03213] Saving new best policy, reward=1247.512! [2024-12-13 04:06:55,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 892928. Throughput: 0: 1098.4. Samples: 892184. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:06:55,371][03180] Avg episode reward: [(0, '1258.038')] [2024-12-13 04:06:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001744_892928.pth... [2024-12-13 04:06:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001680_860160.pth [2024-12-13 04:06:55,386][03213] Saving new best policy, reward=1258.038! [2024-12-13 04:07:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1083.0). Total num frames: 897024. Throughput: 0: 1115.0. Samples: 899820. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:07:00,371][03180] Avg episode reward: [(0, '1261.969')] [2024-12-13 04:07:00,416][03213] Saving new best policy, reward=1261.969! 
[2024-12-13 04:07:00,424][03226] Updated weights for policy 0, policy_version 1760 (0.0018) [2024-12-13 04:07:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 905216. Throughput: 0: 1100.3. Samples: 905100. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:07:05,371][03180] Avg episode reward: [(0, '1318.581')] [2024-12-13 04:07:05,372][03213] Saving new best policy, reward=1318.581! [2024-12-13 04:07:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 909312. Throughput: 0: 1102.0. Samples: 908900. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:07:10,372][03180] Avg episode reward: [(0, '1341.236')] [2024-12-13 04:07:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001776_909312.pth... [2024-12-13 04:07:10,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001712_876544.pth [2024-12-13 04:07:10,389][03213] Saving new best policy, reward=1341.236! [2024-12-13 04:07:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 917504. Throughput: 0: 1115.6. Samples: 916532. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:07:15,376][03180] Avg episode reward: [(0, '1370.618')] [2024-12-13 04:07:15,377][03213] Saving new best policy, reward=1370.618! [2024-12-13 04:07:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 921600. Throughput: 0: 1111.5. Samples: 921984. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:07:20,371][03180] Avg episode reward: [(0, '1380.871')] [2024-12-13 04:07:20,372][03213] Saving new best policy, reward=1380.871! [2024-12-13 04:07:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 925696. Throughput: 0: 1104.1. Samples: 925432. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:07:25,371][03180] Avg episode reward: [(0, '1380.614')] [2024-12-13 04:07:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001808_925696.pth... [2024-12-13 04:07:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001744_892928.pth [2024-12-13 04:07:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1096.9). Total num frames: 933888. Throughput: 0: 1112.2. Samples: 933180. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:07:30,371][03180] Avg episode reward: [(0, '1418.655')] [2024-12-13 04:07:30,372][03213] Saving new best policy, reward=1418.655! [2024-12-13 04:07:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 937984. Throughput: 0: 1122.8. Samples: 939000. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:07:35,371][03180] Avg episode reward: [(0, '1427.613')] [2024-12-13 04:07:35,372][03213] Saving new best policy, reward=1427.613! [2024-12-13 04:07:38,113][03226] Updated weights for policy 0, policy_version 1840 (0.0010) [2024-12-13 04:07:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 942080. Throughput: 0: 1108.7. Samples: 942076. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:07:40,371][03180] Avg episode reward: [(0, '1429.672')] [2024-12-13 04:07:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001840_942080.pth... [2024-12-13 04:07:40,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001776_909312.pth [2024-12-13 04:07:40,381][03213] Saving new best policy, reward=1429.672! [2024-12-13 04:07:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 946176. Throughput: 0: 1080.4. Samples: 948436. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:07:45,371][03180] Avg episode reward: [(0, '1455.720')] [2024-12-13 04:07:45,372][03213] Saving new best policy, reward=1455.720! [2024-12-13 04:07:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1069.1). Total num frames: 950272. Throughput: 0: 1076.1. Samples: 953524. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:07:50,371][03180] Avg episode reward: [(0, '1449.681')] [2024-12-13 04:07:55,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1083.0). Total num frames: 958464. Throughput: 0: 1049.2. Samples: 956116. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:07:55,374][03180] Avg episode reward: [(0, '1443.458')] [2024-12-13 04:07:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001872_958464.pth... [2024-12-13 04:07:55,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001808_925696.pth [2024-12-13 04:08:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 962560. Throughput: 0: 1050.3. Samples: 963796. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:08:00,371][03180] Avg episode reward: [(0, '1413.969')] [2024-12-13 04:08:05,371][03180] Fps is (10 sec: 819.4, 60 sec: 1024.0, 300 sec: 1069.1). Total num frames: 966656. Throughput: 0: 1079.2. Samples: 970548. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:08:05,371][03180] Avg episode reward: [(0, '1392.441')] [2024-12-13 04:08:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 974848. Throughput: 0: 1057.0. Samples: 972996. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:08:10,371][03180] Avg episode reward: [(0, '1431.750')] [2024-12-13 04:08:10,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001904_974848.pth... 
[2024-12-13 04:08:10,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001840_942080.pth [2024-12-13 04:08:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 978944. Throughput: 0: 1048.5. Samples: 980364. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:08:15,371][03180] Avg episode reward: [(0, '1418.851')] [2024-12-13 04:08:16,151][03226] Updated weights for policy 0, policy_version 1920 (0.0018) [2024-12-13 04:08:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 987136. Throughput: 0: 1075.0. Samples: 987376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:08:20,371][03180] Avg episode reward: [(0, '1453.641')] [2024-12-13 04:08:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 991232. Throughput: 0: 1063.6. Samples: 989940. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:08:25,372][03180] Avg episode reward: [(0, '1455.904')] [2024-12-13 04:08:25,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001936_991232.pth... [2024-12-13 04:08:25,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001872_958464.pth [2024-12-13 04:08:25,388][03213] Saving new best policy, reward=1455.904! [2024-12-13 04:08:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 995328. Throughput: 0: 1077.7. Samples: 996932. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:08:30,372][03180] Avg episode reward: [(0, '1444.492')] [2024-12-13 04:08:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1003520. Throughput: 0: 1131.3. Samples: 1004432. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:08:35,371][03180] Avg episode reward: [(0, '1440.963')] [2024-12-13 04:08:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1007616. Throughput: 0: 1132.9. 
Samples: 1007096. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:08:40,371][03180] Avg episode reward: [(0, '1426.215')] [2024-12-13 04:08:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000001968_1007616.pth... [2024-12-13 04:08:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001904_974848.pth [2024-12-13 04:08:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1069.1). Total num frames: 1011712. Throughput: 0: 1111.1. Samples: 1013796. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:08:45,371][03180] Avg episode reward: [(0, '1464.848')] [2024-12-13 04:08:45,372][03213] Saving new best policy, reward=1464.848! [2024-12-13 04:08:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1083.0). Total num frames: 1019904. Throughput: 0: 1131.2. Samples: 1021452. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:08:50,371][03180] Avg episode reward: [(0, '1461.985')] [2024-12-13 04:08:53,205][03226] Updated weights for policy 0, policy_version 2000 (0.0009) [2024-12-13 04:08:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1024000. Throughput: 0: 1132.6. Samples: 1023964. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:08:55,372][03180] Avg episode reward: [(0, '1463.954')] [2024-12-13 04:08:55,385][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002000_1024000.pth... [2024-12-13 04:08:55,398][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001936_991232.pth [2024-12-13 04:09:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1028096. Throughput: 0: 1111.5. Samples: 1030380. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:09:00,372][03180] Avg episode reward: [(0, '1505.201')] [2024-12-13 04:09:00,373][03213] Saving new best policy, reward=1505.201! [2024-12-13 04:09:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). 
Total num frames: 1036288. Throughput: 0: 1126.4. Samples: 1038064. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:09:05,371][03180] Avg episode reward: [(0, '1502.720')] [2024-12-13 04:09:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1040384. Throughput: 0: 1132.5. Samples: 1040904. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:09:10,371][03180] Avg episode reward: [(0, '1500.300')] [2024-12-13 04:09:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002032_1040384.pth... [2024-12-13 04:09:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000001968_1007616.pth [2024-12-13 04:09:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 1048576. Throughput: 0: 1116.5. Samples: 1047176. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:09:15,371][03180] Avg episode reward: [(0, '1517.938')] [2024-12-13 04:09:15,372][03213] Saving new best policy, reward=1517.938! [2024-12-13 04:09:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1052672. Throughput: 0: 1123.9. Samples: 1055008. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:09:20,371][03180] Avg episode reward: [(0, '1635.992')] [2024-12-13 04:09:20,372][03213] Saving new best policy, reward=1635.992! [2024-12-13 04:09:25,371][03180] Fps is (10 sec: 819.1, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1056768. Throughput: 0: 1131.9. Samples: 1058032. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:09:25,372][03180] Avg episode reward: [(0, '1672.074')] [2024-12-13 04:09:25,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002064_1056768.pth... [2024-12-13 04:09:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002000_1024000.pth [2024-12-13 04:09:25,386][03213] Saving new best policy, reward=1672.074! 
[2024-12-13 04:09:29,841][03226] Updated weights for policy 0, policy_version 2080 (0.0015) [2024-12-13 04:09:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 1064960. Throughput: 0: 1115.1. Samples: 1063976. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:09:30,371][03180] Avg episode reward: [(0, '1671.019')] [2024-12-13 04:09:35,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1069056. Throughput: 0: 1116.0. Samples: 1071672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:09:35,371][03180] Avg episode reward: [(0, '1708.658')] [2024-12-13 04:09:35,372][03213] Saving new best policy, reward=1708.658! [2024-12-13 04:09:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1073152. Throughput: 0: 1136.5. Samples: 1075108. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:09:40,371][03180] Avg episode reward: [(0, '1707.839')] [2024-12-13 04:09:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002096_1073152.pth... [2024-12-13 04:09:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002032_1040384.pth [2024-12-13 04:09:45,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 1081344. Throughput: 0: 1117.7. Samples: 1080680. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:09:45,373][03180] Avg episode reward: [(0, '1722.200')] [2024-12-13 04:09:45,374][03213] Saving new best policy, reward=1722.200! [2024-12-13 04:09:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1085440. Throughput: 0: 1116.3. Samples: 1088296. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:09:50,371][03180] Avg episode reward: [(0, '1776.155')] [2024-12-13 04:09:50,372][03213] Saving new best policy, reward=1776.155! [2024-12-13 04:09:55,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1110.8). 
Total num frames: 1093632. Throughput: 0: 1133.5. Samples: 1091912. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:09:55,371][03180] Avg episode reward: [(0, '1792.446')] [2024-12-13 04:09:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002136_1093632.pth... [2024-12-13 04:09:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002064_1056768.pth [2024-12-13 04:09:55,393][03213] Saving new best policy, reward=1792.446! [2024-12-13 04:10:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1097728. Throughput: 0: 1111.3. Samples: 1097184. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:10:00,371][03180] Avg episode reward: [(0, '1805.367')] [2024-12-13 04:10:00,372][03213] Saving new best policy, reward=1805.367! [2024-12-13 04:10:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1101824. Throughput: 0: 1104.1. Samples: 1104692. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:10:05,371][03180] Avg episode reward: [(0, '1886.443')] [2024-12-13 04:10:05,372][03213] Saving new best policy, reward=1886.443! [2024-12-13 04:10:06,107][03226] Updated weights for policy 0, policy_version 2160 (0.0012) [2024-12-13 04:10:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1110016. Throughput: 0: 1120.2. Samples: 1108440. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:10:10,374][03180] Avg episode reward: [(0, '1939.211')] [2024-12-13 04:10:10,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002168_1110016.pth... [2024-12-13 04:10:10,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002096_1073152.pth [2024-12-13 04:10:10,403][03213] Saving new best policy, reward=1939.211! [2024-12-13 04:10:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1114112. Throughput: 0: 1101.4. Samples: 1113540. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:10:15,371][03180] Avg episode reward: [(0, '1941.330')] [2024-12-13 04:10:15,372][03213] Saving new best policy, reward=1941.330! [2024-12-13 04:10:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1118208. Throughput: 0: 1100.4. Samples: 1121188. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:10:20,371][03180] Avg episode reward: [(0, '1964.811')] [2024-12-13 04:10:20,372][03213] Saving new best policy, reward=1964.811! [2024-12-13 04:10:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1126400. Throughput: 0: 1105.5. Samples: 1124856. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:10:25,371][03180] Avg episode reward: [(0, '1967.814')] [2024-12-13 04:10:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002200_1126400.pth... [2024-12-13 04:10:25,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002136_1093632.pth [2024-12-13 04:10:25,388][03213] Saving new best policy, reward=1967.814! [2024-12-13 04:10:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1130496. Throughput: 0: 1101.5. Samples: 1130244. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:10:30,371][03180] Avg episode reward: [(0, '1944.262')] [2024-12-13 04:10:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1134592. Throughput: 0: 1100.9. Samples: 1137836. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:10:35,371][03180] Avg episode reward: [(0, '1900.349')] [2024-12-13 04:10:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1142784. Throughput: 0: 1106.6. Samples: 1141708. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:10:40,371][03180] Avg episode reward: [(0, '1904.033')] [2024-12-13 04:10:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002232_1142784.pth... [2024-12-13 04:10:40,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002168_1110016.pth [2024-12-13 04:10:43,483][03226] Updated weights for policy 0, policy_version 2240 (0.0017) [2024-12-13 04:10:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1146880. Throughput: 0: 1115.6. Samples: 1147384. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:10:45,371][03180] Avg episode reward: [(0, '1859.902')] [2024-12-13 04:10:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1155072. Throughput: 0: 1111.7. Samples: 1154720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:10:50,371][03180] Avg episode reward: [(0, '1906.554')] [2024-12-13 04:10:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1159168. Throughput: 0: 1115.5. Samples: 1158636. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:10:55,371][03180] Avg episode reward: [(0, '1900.185')] [2024-12-13 04:10:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002264_1159168.pth... [2024-12-13 04:10:55,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002200_1126400.pth [2024-12-13 04:11:00,374][03180] Fps is (10 sec: 818.9, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 1163264. Throughput: 0: 1131.5. Samples: 1164460. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:11:00,375][03180] Avg episode reward: [(0, '1880.480')] [2024-12-13 04:11:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1171456. Throughput: 0: 1115.6. Samples: 1171388. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:11:05,371][03180] Avg episode reward: [(0, '1871.453')] [2024-12-13 04:11:10,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1175552. Throughput: 0: 1120.7. Samples: 1175288. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:11:10,371][03180] Avg episode reward: [(0, '1869.504')] [2024-12-13 04:11:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002296_1175552.pth... [2024-12-13 04:11:10,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002232_1142784.pth [2024-12-13 04:11:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1179648. Throughput: 0: 1136.7. Samples: 1181396. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:11:15,371][03180] Avg episode reward: [(0, '1888.208')] [2024-12-13 04:11:19,959][03226] Updated weights for policy 0, policy_version 2320 (0.0012) [2024-12-13 04:11:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1187840. Throughput: 0: 1112.8. Samples: 1187912. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:11:20,371][03180] Avg episode reward: [(0, '1895.719')] [2024-12-13 04:11:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1191936. Throughput: 0: 1096.3. Samples: 1191040. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:11:25,371][03180] Avg episode reward: [(0, '1855.386')] [2024-12-13 04:11:25,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002328_1191936.pth... [2024-12-13 04:11:25,395][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002264_1159168.pth [2024-12-13 04:11:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1196032. Throughput: 0: 1076.9. Samples: 1195844. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:11:30,374][03180] Avg episode reward: [(0, '1852.854')] [2024-12-13 04:11:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1200128. Throughput: 0: 1049.2. Samples: 1201932. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:11:35,371][03180] Avg episode reward: [(0, '1798.606')] [2024-12-13 04:11:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1208320. Throughput: 0: 1048.2. Samples: 1205804. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:11:40,371][03180] Avg episode reward: [(0, '1799.361')] [2024-12-13 04:11:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002360_1208320.pth... [2024-12-13 04:11:40,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002296_1175552.pth [2024-12-13 04:11:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1212416. Throughput: 0: 1069.1. Samples: 1212568. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:11:45,371][03180] Avg episode reward: [(0, '1806.374')] [2024-12-13 04:11:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 1216512. Throughput: 0: 1050.8. Samples: 1218676. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:11:50,371][03180] Avg episode reward: [(0, '1875.538')] [2024-12-13 04:11:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1224704. Throughput: 0: 1047.8. Samples: 1222440. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:11:55,371][03180] Avg episode reward: [(0, '1899.337')] [2024-12-13 04:11:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002392_1224704.pth... 
[2024-12-13 04:11:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002328_1191936.pth [2024-12-13 04:11:58,165][03226] Updated weights for policy 0, policy_version 2400 (0.0010) [2024-12-13 04:12:00,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1228800. Throughput: 0: 1070.8. Samples: 1229584. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:12:00,375][03180] Avg episode reward: [(0, '1920.115')] [2024-12-13 04:12:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 1232896. Throughput: 0: 1052.3. Samples: 1235264. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:12:05,371][03180] Avg episode reward: [(0, '1975.404')] [2024-12-13 04:12:05,372][03213] Saving new best policy, reward=1975.404! [2024-12-13 04:12:10,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1241088. Throughput: 0: 1066.8. Samples: 1239048. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:12:10,371][03180] Avg episode reward: [(0, '2002.591')] [2024-12-13 04:12:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002424_1241088.pth... [2024-12-13 04:12:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002360_1208320.pth [2024-12-13 04:12:10,392][03213] Saving new best policy, reward=2002.591! [2024-12-13 04:12:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1245184. Throughput: 0: 1124.3. Samples: 1246436. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:12:15,371][03180] Avg episode reward: [(0, '1982.450')] [2024-12-13 04:12:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 1249280. Throughput: 0: 1109.5. Samples: 1251860. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:12:20,371][03180] Avg episode reward: [(0, '1962.826')] [2024-12-13 04:12:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1257472. Throughput: 0: 1108.0. Samples: 1255664. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:12:25,371][03180] Avg episode reward: [(0, '1997.991')] [2024-12-13 04:12:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002456_1257472.pth... [2024-12-13 04:12:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002392_1224704.pth [2024-12-13 04:12:30,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1261568. Throughput: 0: 1128.0. Samples: 1263328. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:12:30,373][03180] Avg episode reward: [(0, '2036.328')] [2024-12-13 04:12:30,374][03213] Saving new best policy, reward=2036.328! [2024-12-13 04:12:35,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1265664. Throughput: 0: 1110.4. Samples: 1268648. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:12:35,376][03180] Avg episode reward: [(0, '2036.328')] [2024-12-13 04:12:35,945][03226] Updated weights for policy 0, policy_version 2480 (0.0010) [2024-12-13 04:12:40,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1273856. Throughput: 0: 1112.2. Samples: 1272488. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:12:40,371][03180] Avg episode reward: [(0, '2016.841')] [2024-12-13 04:12:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002488_1273856.pth... [2024-12-13 04:12:40,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002424_1241088.pth [2024-12-13 04:12:45,372][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 1277952. Throughput: 0: 1120.1. Samples: 1279984. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:12:45,373][03180] Avg episode reward: [(0, '2008.942')] [2024-12-13 04:12:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1282048. Throughput: 0: 1116.2. Samples: 1285492. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:12:50,373][03180] Avg episode reward: [(0, '2060.817')] [2024-12-13 04:12:50,374][03213] Saving new best policy, reward=2060.817! [2024-12-13 04:12:55,374][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 1290240. Throughput: 0: 1111.3. Samples: 1289060. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:12:55,375][03180] Avg episode reward: [(0, '2095.462')] [2024-12-13 04:12:55,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002520_1290240.pth... [2024-12-13 04:12:55,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002456_1257472.pth [2024-12-13 04:12:55,386][03213] Saving new best policy, reward=2095.462! [2024-12-13 04:13:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1294336. Throughput: 0: 1114.0. Samples: 1296568. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:13:00,371][03180] Avg episode reward: [(0, '2154.939')] [2024-12-13 04:13:00,372][03213] Saving new best policy, reward=2154.939! [2024-12-13 04:13:05,371][03180] Fps is (10 sec: 1229.3, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1302528. Throughput: 0: 1124.1. Samples: 1302444. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:13:05,371][03180] Avg episode reward: [(0, '2176.595')] [2024-12-13 04:13:05,372][03213] Saving new best policy, reward=2176.595! [2024-12-13 04:13:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1306624. Throughput: 0: 1110.3. Samples: 1305628. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:13:10,371][03180] Avg episode reward: [(0, '2137.148')] [2024-12-13 04:13:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002552_1306624.pth... [2024-12-13 04:13:10,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002488_1273856.pth [2024-12-13 04:13:12,050][03226] Updated weights for policy 0, policy_version 2560 (0.0011) [2024-12-13 04:13:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1314816. Throughput: 0: 1110.5. Samples: 1313300. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:13:15,373][03180] Avg episode reward: [(0, '2125.358')] [2024-12-13 04:13:20,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1318912. Throughput: 0: 1125.8. Samples: 1319308. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:13:20,374][03180] Avg episode reward: [(0, '2143.510')] [2024-12-13 04:13:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1323008. Throughput: 0: 1106.6. Samples: 1322284. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:13:25,371][03180] Avg episode reward: [(0, '2166.575')] [2024-12-13 04:13:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002584_1323008.pth... [2024-12-13 04:13:25,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002520_1290240.pth [2024-12-13 04:13:30,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 1331200. Throughput: 0: 1108.7. Samples: 1329872. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:13:30,371][03180] Avg episode reward: [(0, '2136.334')] [2024-12-13 04:13:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 1335296. Throughput: 0: 1128.1. Samples: 1336256. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:13:35,372][03180] Avg episode reward: [(0, '2109.489')] [2024-12-13 04:13:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1339392. Throughput: 0: 1107.3. Samples: 1338884. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:13:40,371][03180] Avg episode reward: [(0, '2124.387')] [2024-12-13 04:13:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002616_1339392.pth... [2024-12-13 04:13:40,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002552_1306624.pth [2024-12-13 04:13:45,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 1347584. Throughput: 0: 1107.2. Samples: 1346392. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:13:45,371][03180] Avg episode reward: [(0, '2132.273')] [2024-12-13 04:13:48,700][03226] Updated weights for policy 0, policy_version 2640 (0.0016) [2024-12-13 04:13:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1351680. Throughput: 0: 1118.6. Samples: 1352780. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:13:50,371][03180] Avg episode reward: [(0, '2173.091')] [2024-12-13 04:13:55,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1355776. Throughput: 0: 1104.3. Samples: 1355324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:13:55,373][03180] Avg episode reward: [(0, '2196.688')] [2024-12-13 04:13:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002648_1355776.pth... [2024-12-13 04:13:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002584_1323008.pth [2024-12-13 04:13:55,383][03213] Saving new best policy, reward=2196.688! [2024-12-13 04:14:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1363968. Throughput: 0: 1096.5. Samples: 1362644. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:14:00,371][03180] Avg episode reward: [(0, '2197.042')] [2024-12-13 04:14:00,372][03213] Saving new best policy, reward=2197.042! [2024-12-13 04:14:05,374][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1368064. Throughput: 0: 1112.0. Samples: 1369344. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:14:05,374][03180] Avg episode reward: [(0, '2188.460')] [2024-12-13 04:14:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1372160. Throughput: 0: 1101.3. Samples: 1371844. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:14:10,371][03180] Avg episode reward: [(0, '2161.930')] [2024-12-13 04:14:10,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002680_1372160.pth... [2024-12-13 04:14:10,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002616_1339392.pth [2024-12-13 04:14:15,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 1380352. Throughput: 0: 1085.3. Samples: 1378716. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:14:15,376][03180] Avg episode reward: [(0, '2145.763')] [2024-12-13 04:14:20,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1384448. Throughput: 0: 1100.9. Samples: 1385800. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:14:20,374][03180] Avg episode reward: [(0, '2138.197')] [2024-12-13 04:14:25,371][03180] Fps is (10 sec: 819.6, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1388544. Throughput: 0: 1098.9. Samples: 1388336. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:14:25,371][03180] Avg episode reward: [(0, '2137.084')] [2024-12-13 04:14:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002712_1388544.pth... 
[2024-12-13 04:14:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002648_1355776.pth [2024-12-13 04:14:27,188][03226] Updated weights for policy 0, policy_version 2720 (0.0013) [2024-12-13 04:14:30,371][03180] Fps is (10 sec: 819.4, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 1392640. Throughput: 0: 1080.9. Samples: 1395032. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:14:30,372][03180] Avg episode reward: [(0, '2087.694')] [2024-12-13 04:14:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1400832. Throughput: 0: 1106.2. Samples: 1402560. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:14:35,371][03180] Avg episode reward: [(0, '2150.559')] [2024-12-13 04:14:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1404928. Throughput: 0: 1102.5. Samples: 1404936. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:14:40,371][03180] Avg episode reward: [(0, '2131.585')] [2024-12-13 04:14:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002744_1404928.pth... [2024-12-13 04:14:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002680_1372160.pth [2024-12-13 04:14:45,375][03180] Fps is (10 sec: 1228.3, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 1413120. Throughput: 0: 1088.5. Samples: 1411632. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:14:45,375][03180] Avg episode reward: [(0, '2116.439')] [2024-12-13 04:14:50,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1417216. Throughput: 0: 1108.2. Samples: 1419216. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:14:50,374][03180] Avg episode reward: [(0, '2142.094')] [2024-12-13 04:14:55,371][03180] Fps is (10 sec: 819.5, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1421312. Throughput: 0: 1111.0. Samples: 1421840. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:14:55,371][03180] Avg episode reward: [(0, '2141.825')] [2024-12-13 04:14:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002776_1421312.pth... [2024-12-13 04:14:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002712_1388544.pth [2024-12-13 04:15:00,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 1429504. Throughput: 0: 1098.6. Samples: 1428148. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:15:00,371][03180] Avg episode reward: [(0, '2241.108')] [2024-12-13 04:15:00,372][03213] Saving new best policy, reward=2241.108! [2024-12-13 04:15:03,751][03226] Updated weights for policy 0, policy_version 2800 (0.0009) [2024-12-13 04:15:05,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1433600. Throughput: 0: 1085.9. Samples: 1434664. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:15:05,374][03180] Avg episode reward: [(0, '2239.342')] [2024-12-13 04:15:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1437696. Throughput: 0: 1079.3. Samples: 1436904. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:15:10,376][03180] Avg episode reward: [(0, '2223.192')] [2024-12-13 04:15:10,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002808_1437696.pth... [2024-12-13 04:15:10,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002744_1404928.pth [2024-12-13 04:15:15,371][03180] Fps is (10 sec: 819.3, 60 sec: 1024.1, 300 sec: 1096.9). Total num frames: 1441792. Throughput: 0: 1040.9. Samples: 1441872. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:15:15,371][03180] Avg episode reward: [(0, '2196.508')] [2024-12-13 04:15:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 1445888. Throughput: 0: 1046.3. Samples: 1449644. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:15:20,371][03180] Avg episode reward: [(0, '2193.097')] [2024-12-13 04:15:25,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1454080. Throughput: 0: 1079.2. Samples: 1453504. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:15:25,375][03180] Avg episode reward: [(0, '2228.239')] [2024-12-13 04:15:25,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002840_1454080.pth... [2024-12-13 04:15:25,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002776_1421312.pth [2024-12-13 04:15:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1458176. Throughput: 0: 1040.9. Samples: 1458468. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:15:30,371][03180] Avg episode reward: [(0, '2267.398')] [2024-12-13 04:15:30,372][03213] Saving new best policy, reward=2267.398! [2024-12-13 04:15:35,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1466368. Throughput: 0: 1046.3. Samples: 1466296. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:15:35,371][03180] Avg episode reward: [(0, '2277.499')] [2024-12-13 04:15:35,373][03213] Saving new best policy, reward=2277.499! [2024-12-13 04:15:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1470464. Throughput: 0: 1074.7. Samples: 1470200. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:15:40,371][03180] Avg episode reward: [(0, '2301.931')] [2024-12-13 04:15:40,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002872_1470464.pth... [2024-12-13 04:15:40,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002808_1437696.pth [2024-12-13 04:15:40,388][03213] Saving new best policy, reward=2301.931! 
[2024-12-13 04:15:43,017][03226] Updated weights for policy 0, policy_version 2880 (0.0010) [2024-12-13 04:15:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.1, 300 sec: 1083.0). Total num frames: 1474560. Throughput: 0: 1051.0. Samples: 1475444. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:15:45,372][03180] Avg episode reward: [(0, '2354.138')] [2024-12-13 04:15:45,372][03213] Saving new best policy, reward=2354.138! [2024-12-13 04:15:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1482752. Throughput: 0: 1069.9. Samples: 1482808. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:15:50,372][03180] Avg episode reward: [(0, '2336.871')] [2024-12-13 04:15:55,378][03180] Fps is (10 sec: 1227.9, 60 sec: 1092.1, 300 sec: 1096.9). Total num frames: 1486848. Throughput: 0: 1105.2. Samples: 1486648. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:15:55,379][03180] Avg episode reward: [(0, '2353.948')] [2024-12-13 04:15:55,395][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002904_1486848.pth... [2024-12-13 04:15:55,400][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002840_1454080.pth [2024-12-13 04:16:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 1490944. Throughput: 0: 1117.8. Samples: 1492172. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:16:00,373][03180] Avg episode reward: [(0, '2335.964')] [2024-12-13 04:16:05,371][03180] Fps is (10 sec: 1229.7, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1499136. Throughput: 0: 1102.9. Samples: 1499276. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:16:05,371][03180] Avg episode reward: [(0, '2418.739')] [2024-12-13 04:16:05,372][03213] Saving new best policy, reward=2418.739! [2024-12-13 04:16:10,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1503232. Throughput: 0: 1104.4. Samples: 1503204. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:16:10,376][03180] Avg episode reward: [(0, '2402.655')] [2024-12-13 04:16:10,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002936_1503232.pth... [2024-12-13 04:16:10,394][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002872_1470464.pth [2024-12-13 04:16:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1507328. Throughput: 0: 1125.1. Samples: 1509096. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:16:15,371][03180] Avg episode reward: [(0, '2407.342')] [2024-12-13 04:16:19,686][03226] Updated weights for policy 0, policy_version 2960 (0.0012) [2024-12-13 04:16:20,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 1515520. Throughput: 0: 1103.7. Samples: 1515964. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:16:20,374][03180] Avg episode reward: [(0, '2418.169')] [2024-12-13 04:16:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1519616. Throughput: 0: 1103.3. Samples: 1519848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:16:25,371][03180] Avg episode reward: [(0, '2456.789')] [2024-12-13 04:16:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000002968_1519616.pth... [2024-12-13 04:16:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002904_1486848.pth [2024-12-13 04:16:25,390][03213] Saving new best policy, reward=2456.789! [2024-12-13 04:16:30,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1523712. Throughput: 0: 1125.4. Samples: 1526092. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:16:30,374][03180] Avg episode reward: [(0, '2456.789')] [2024-12-13 04:16:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1531904. Throughput: 0: 1108.4. Samples: 1532684. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:16:35,371][03180] Avg episode reward: [(0, '2490.540')] [2024-12-13 04:16:35,372][03213] Saving new best policy, reward=2490.540! [2024-12-13 04:16:40,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1536000. Throughput: 0: 1106.9. Samples: 1536452. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:16:40,371][03180] Avg episode reward: [(0, '2500.910')] [2024-12-13 04:16:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003000_1536000.pth... [2024-12-13 04:16:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002936_1503232.pth [2024-12-13 04:16:40,386][03213] Saving new best policy, reward=2500.910! [2024-12-13 04:16:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1540096. Throughput: 0: 1127.3. Samples: 1542900. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:16:45,371][03180] Avg episode reward: [(0, '2542.286')] [2024-12-13 04:16:45,372][03213] Saving new best policy, reward=2542.286! [2024-12-13 04:16:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1548288. Throughput: 0: 1107.4. Samples: 1549108. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:16:50,371][03180] Avg episode reward: [(0, '2622.762')] [2024-12-13 04:16:50,372][03213] Saving new best policy, reward=2622.762! [2024-12-13 04:16:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1096.9). Total num frames: 1552384. Throughput: 0: 1105.1. Samples: 1552928. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:16:55,372][03180] Avg episode reward: [(0, '2642.543')] [2024-12-13 04:16:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003032_1552384.pth... 
[2024-12-13 04:16:55,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000002968_1519616.pth [2024-12-13 04:16:55,386][03213] Saving new best policy, reward=2642.543! [2024-12-13 04:16:55,874][03226] Updated weights for policy 0, policy_version 3040 (0.0013) [2024-12-13 04:17:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1560576. Throughput: 0: 1125.6. Samples: 1559748. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:17:00,371][03180] Avg episode reward: [(0, '2649.951')] [2024-12-13 04:17:00,372][03213] Saving new best policy, reward=2649.951! [2024-12-13 04:17:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1564672. Throughput: 0: 1103.4. Samples: 1565616. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:17:05,371][03180] Avg episode reward: [(0, '2672.380')] [2024-12-13 04:17:05,372][03213] Saving new best policy, reward=2672.380! [2024-12-13 04:17:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.4, 300 sec: 1096.9). Total num frames: 1568768. Throughput: 0: 1098.8. Samples: 1569292. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:17:10,371][03180] Avg episode reward: [(0, '2689.019')] [2024-12-13 04:17:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003064_1568768.pth... [2024-12-13 04:17:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003000_1536000.pth [2024-12-13 04:17:10,386][03213] Saving new best policy, reward=2689.019! [2024-12-13 04:17:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1576960. Throughput: 0: 1115.4. Samples: 1576280. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:17:15,374][03180] Avg episode reward: [(0, '2716.622')] [2024-12-13 04:17:15,375][03213] Saving new best policy, reward=2716.622! [2024-12-13 04:17:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1581056. 
Throughput: 0: 1091.9. Samples: 1581820. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:17:20,371][03180] Avg episode reward: [(0, '2732.698')] [2024-12-13 04:17:20,372][03213] Saving new best policy, reward=2732.698! [2024-12-13 04:17:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1585152. Throughput: 0: 1093.2. Samples: 1585648. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:17:25,371][03180] Avg episode reward: [(0, '2833.913')] [2024-12-13 04:17:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003096_1585152.pth... [2024-12-13 04:17:25,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003032_1552384.pth [2024-12-13 04:17:25,384][03213] Saving new best policy, reward=2833.913! [2024-12-13 04:17:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 1593344. Throughput: 0: 1115.6. Samples: 1593104. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:17:30,371][03180] Avg episode reward: [(0, '2816.401')] [2024-12-13 04:17:34,301][03226] Updated weights for policy 0, policy_version 3120 (0.0010) [2024-12-13 04:17:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1597440. Throughput: 0: 1095.1. Samples: 1598388. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:17:35,371][03180] Avg episode reward: [(0, '2821.501')] [2024-12-13 04:17:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1601536. Throughput: 0: 1096.4. Samples: 1602264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:17:40,371][03180] Avg episode reward: [(0, '2828.100')] [2024-12-13 04:17:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003128_1601536.pth... 
[2024-12-13 04:17:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003064_1568768.pth [2024-12-13 04:17:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1609728. Throughput: 0: 1116.1. Samples: 1609972. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:17:45,371][03180] Avg episode reward: [(0, '2842.379')] [2024-12-13 04:17:45,372][03213] Saving new best policy, reward=2842.379! [2024-12-13 04:17:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1613824. Throughput: 0: 1100.1. Samples: 1615120. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:17:50,371][03180] Avg episode reward: [(0, '2846.225')] [2024-12-13 04:17:50,373][03213] Saving new best policy, reward=2846.225! [2024-12-13 04:17:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1617920. Throughput: 0: 1103.4. Samples: 1618944. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:17:55,372][03180] Avg episode reward: [(0, '2800.952')] [2024-12-13 04:17:55,418][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003168_1622016.pth... [2024-12-13 04:17:55,423][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003096_1585152.pth [2024-12-13 04:18:00,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1626112. Throughput: 0: 1116.3. Samples: 1626516. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:18:00,374][03180] Avg episode reward: [(0, '2783.031')] [2024-12-13 04:18:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1630208. Throughput: 0: 1109.1. Samples: 1631728. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:18:05,371][03180] Avg episode reward: [(0, '2775.045')] [2024-12-13 04:18:10,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1634304. Throughput: 0: 1105.6. 
Samples: 1635400. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:18:10,372][03180] Avg episode reward: [(0, '2744.758')] [2024-12-13 04:18:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003192_1634304.pth... [2024-12-13 04:18:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003128_1601536.pth [2024-12-13 04:18:10,467][03226] Updated weights for policy 0, policy_version 3200 (0.0017) [2024-12-13 04:18:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1642496. Throughput: 0: 1106.9. Samples: 1642916. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:18:15,371][03180] Avg episode reward: [(0, '2690.826')] [2024-12-13 04:18:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1646592. Throughput: 0: 1113.5. Samples: 1648496. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:18:20,371][03180] Avg episode reward: [(0, '2656.038')] [2024-12-13 04:18:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 1654784. Throughput: 0: 1105.3. Samples: 1652004. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:18:25,371][03180] Avg episode reward: [(0, '2623.562')] [2024-12-13 04:18:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003232_1654784.pth... [2024-12-13 04:18:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003168_1622016.pth [2024-12-13 04:18:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1658880. Throughput: 0: 1097.4. Samples: 1659356. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:18:30,375][03180] Avg episode reward: [(0, '2505.898')] [2024-12-13 04:18:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1662976. Throughput: 0: 1112.6. Samples: 1665188. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:18:35,371][03180] Avg episode reward: [(0, '2511.080')] [2024-12-13 04:18:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 1671168. Throughput: 0: 1097.2. Samples: 1668316. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:18:40,371][03180] Avg episode reward: [(0, '2491.623')] [2024-12-13 04:18:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003264_1671168.pth... [2024-12-13 04:18:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003192_1634304.pth [2024-12-13 04:18:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1675264. Throughput: 0: 1095.9. Samples: 1675828. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:18:45,371][03180] Avg episode reward: [(0, '2424.104')] [2024-12-13 04:18:48,601][03226] Updated weights for policy 0, policy_version 3280 (0.0010) [2024-12-13 04:18:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1679360. Throughput: 0: 1079.9. Samples: 1680324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:18:50,372][03180] Avg episode reward: [(0, '2370.873')] [2024-12-13 04:18:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1683456. Throughput: 0: 1049.9. Samples: 1682644. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:18:55,371][03180] Avg episode reward: [(0, '2286.090')] [2024-12-13 04:18:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003288_1683456.pth... [2024-12-13 04:18:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003232_1654784.pth [2024-12-13 04:19:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 1687552. Throughput: 0: 1042.8. Samples: 1689840. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:19:00,372][03180] Avg episode reward: [(0, '2260.227')] [2024-12-13 04:19:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1695744. Throughput: 0: 1078.2. Samples: 1697016. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:19:05,371][03180] Avg episode reward: [(0, '2268.544')] [2024-12-13 04:19:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1699840. Throughput: 0: 1058.0. Samples: 1699612. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:19:10,371][03180] Avg episode reward: [(0, '2243.114')] [2024-12-13 04:19:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003320_1699840.pth... [2024-12-13 04:19:10,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003264_1671168.pth [2024-12-13 04:19:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 1703936. Throughput: 0: 1042.9. Samples: 1706288. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:19:15,371][03180] Avg episode reward: [(0, '2182.090')] [2024-12-13 04:19:20,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1712128. Throughput: 0: 1081.1. Samples: 1713840. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:19:20,374][03180] Avg episode reward: [(0, '2159.901')] [2024-12-13 04:19:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 1716224. Throughput: 0: 1066.0. Samples: 1716288. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:19:25,371][03180] Avg episode reward: [(0, '2136.278')] [2024-12-13 04:19:25,387][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003352_1716224.pth... 
[2024-12-13 04:19:25,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003288_1683456.pth [2024-12-13 04:19:27,131][03226] Updated weights for policy 0, policy_version 3360 (0.0015) [2024-12-13 04:19:30,371][03180] Fps is (10 sec: 819.4, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 1720320. Throughput: 0: 1044.4. Samples: 1722824. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:19:30,372][03180] Avg episode reward: [(0, '2162.269')] [2024-12-13 04:19:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1728512. Throughput: 0: 1110.2. Samples: 1730284. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:19:35,371][03180] Avg episode reward: [(0, '2137.975')] [2024-12-13 04:19:40,376][03180] Fps is (10 sec: 1228.1, 60 sec: 1023.9, 300 sec: 1083.0). Total num frames: 1732608. Throughput: 0: 1116.8. Samples: 1732904. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:19:40,377][03180] Avg episode reward: [(0, '2150.810')] [2024-12-13 04:19:40,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003384_1732608.pth... [2024-12-13 04:19:40,395][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003320_1699840.pth [2024-12-13 04:19:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1740800. Throughput: 0: 1094.5. Samples: 1739092. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:19:45,371][03180] Avg episode reward: [(0, '2112.760')] [2024-12-13 04:19:50,371][03180] Fps is (10 sec: 1229.5, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1744896. Throughput: 0: 1104.8. Samples: 1746732. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:19:50,371][03180] Avg episode reward: [(0, '2090.918')] [2024-12-13 04:19:55,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1083.0). Total num frames: 1748992. Throughput: 0: 1114.5. Samples: 1749768. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:19:55,374][03180] Avg episode reward: [(0, '2091.464')] [2024-12-13 04:19:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003416_1748992.pth... [2024-12-13 04:19:55,411][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003352_1716224.pth [2024-12-13 04:20:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 1757184. Throughput: 0: 1097.8. Samples: 1755688. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:20:00,371][03180] Avg episode reward: [(0, '2087.974')] [2024-12-13 04:20:03,519][03226] Updated weights for policy 0, policy_version 3440 (0.0014) [2024-12-13 04:20:05,373][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1761280. Throughput: 0: 1097.6. Samples: 1763232. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:20:05,374][03180] Avg episode reward: [(0, '2091.183')] [2024-12-13 04:20:10,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1765376. Throughput: 0: 1116.6. Samples: 1766536. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:20:10,373][03180] Avg episode reward: [(0, '2147.917')] [2024-12-13 04:20:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003448_1765376.pth... [2024-12-13 04:20:10,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003384_1732608.pth [2024-12-13 04:20:15,372][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1773568. Throughput: 0: 1094.7. Samples: 1772088. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:20:15,373][03180] Avg episode reward: [(0, '2156.592')] [2024-12-13 04:20:20,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1777664. Throughput: 0: 1096.9. Samples: 1779644. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:20:20,371][03180] Avg episode reward: [(0, '2264.726')] [2024-12-13 04:20:25,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1781760. Throughput: 0: 1120.9. Samples: 1783340. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:20:25,371][03180] Avg episode reward: [(0, '2274.255')] [2024-12-13 04:20:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003480_1781760.pth... [2024-12-13 04:20:25,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003416_1748992.pth [2024-12-13 04:20:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 1789952. Throughput: 0: 1097.7. Samples: 1788488. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:20:30,371][03180] Avg episode reward: [(0, '2344.631')] [2024-12-13 04:20:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1794048. Throughput: 0: 1096.2. Samples: 1796060. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:20:35,371][03180] Avg episode reward: [(0, '2364.647')] [2024-12-13 04:20:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.4, 300 sec: 1096.9). Total num frames: 1798144. Throughput: 0: 1112.9. Samples: 1799844. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:20:40,371][03180] Avg episode reward: [(0, '2354.289')] [2024-12-13 04:20:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003512_1798144.pth... [2024-12-13 04:20:40,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003448_1765376.pth [2024-12-13 04:20:40,707][03226] Updated weights for policy 0, policy_version 3520 (0.0009) [2024-12-13 04:20:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1806336. Throughput: 0: 1098.6. Samples: 1805124. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:20:45,371][03180] Avg episode reward: [(0, '2376.878')] [2024-12-13 04:20:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1810432. Throughput: 0: 1100.0. Samples: 1812728. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:20:50,371][03180] Avg episode reward: [(0, '2409.249')] [2024-12-13 04:20:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 1818624. Throughput: 0: 1110.7. Samples: 1816516. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:20:55,371][03180] Avg episode reward: [(0, '2392.632')] [2024-12-13 04:20:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003552_1818624.pth... [2024-12-13 04:20:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003480_1781760.pth [2024-12-13 04:21:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1822720. Throughput: 0: 1106.5. Samples: 1821880. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:21:00,372][03180] Avg episode reward: [(0, '2417.819')] [2024-12-13 04:21:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1826816. Throughput: 0: 1103.1. Samples: 1829284. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:21:05,371][03180] Avg episode reward: [(0, '2412.062')] [2024-12-13 04:21:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 1835008. Throughput: 0: 1104.5. Samples: 1833044. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:21:10,371][03180] Avg episode reward: [(0, '2563.127')] [2024-12-13 04:21:10,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003584_1835008.pth... 
[2024-12-13 04:21:10,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003512_1798144.pth [2024-12-13 04:21:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1839104. Throughput: 0: 1115.6. Samples: 1838692. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:21:15,371][03180] Avg episode reward: [(0, '2658.724')] [2024-12-13 04:21:18,149][03226] Updated weights for policy 0, policy_version 3600 (0.0010) [2024-12-13 04:21:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1843200. Throughput: 0: 1102.3. Samples: 1845664. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:21:20,371][03180] Avg episode reward: [(0, '2659.509')] [2024-12-13 04:21:25,372][03180] Fps is (10 sec: 1228.7, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1851392. Throughput: 0: 1101.4. Samples: 1849408. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:21:25,374][03180] Avg episode reward: [(0, '2698.088')] [2024-12-13 04:21:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003616_1851392.pth... [2024-12-13 04:21:25,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003552_1818624.pth [2024-12-13 04:21:30,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1855488. Throughput: 0: 1115.9. Samples: 1855344. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:21:30,374][03180] Avg episode reward: [(0, '2652.025')] [2024-12-13 04:21:35,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1859584. Throughput: 0: 1094.5. Samples: 1861980. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:21:35,371][03180] Avg episode reward: [(0, '2639.162')] [2024-12-13 04:21:40,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 1867776. Throughput: 0: 1093.6. Samples: 1865728. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:21:40,371][03180] Avg episode reward: [(0, '2677.851')] [2024-12-13 04:21:40,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003648_1867776.pth... [2024-12-13 04:21:40,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003584_1835008.pth [2024-12-13 04:21:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1871872. Throughput: 0: 1110.9. Samples: 1871872. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:21:45,371][03180] Avg episode reward: [(0, '2666.185')] [2024-12-13 04:21:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1875968. Throughput: 0: 1089.8. Samples: 1878324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:21:50,371][03180] Avg episode reward: [(0, '2648.906')] [2024-12-13 04:21:54,746][03226] Updated weights for policy 0, policy_version 3680 (0.0009) [2024-12-13 04:21:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1884160. Throughput: 0: 1090.9. Samples: 1882136. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:21:55,371][03180] Avg episode reward: [(0, '2589.306')] [2024-12-13 04:21:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003680_1884160.pth... [2024-12-13 04:21:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003616_1851392.pth [2024-12-13 04:22:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1888256. Throughput: 0: 1107.3. Samples: 1888520. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:22:00,372][03180] Avg episode reward: [(0, '2614.659')] [2024-12-13 04:22:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1892352. Throughput: 0: 1090.3. Samples: 1894728. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:22:05,371][03180] Avg episode reward: [(0, '2665.242')] [2024-12-13 04:22:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1900544. Throughput: 0: 1090.2. Samples: 1898468. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:22:10,371][03180] Avg episode reward: [(0, '2631.602')] [2024-12-13 04:22:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003712_1900544.pth... [2024-12-13 04:22:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003648_1867776.pth [2024-12-13 04:22:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1904640. Throughput: 0: 1109.1. Samples: 1905252. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:22:15,371][03180] Avg episode reward: [(0, '2677.235')] [2024-12-13 04:22:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1908736. Throughput: 0: 1092.6. Samples: 1911148. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:22:20,371][03180] Avg episode reward: [(0, '2664.358')] [2024-12-13 04:22:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1916928. Throughput: 0: 1095.5. Samples: 1915024. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:22:25,372][03180] Avg episode reward: [(0, '2647.607')] [2024-12-13 04:22:25,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003744_1916928.pth... [2024-12-13 04:22:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003680_1884160.pth [2024-12-13 04:22:30,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1921024. Throughput: 0: 1092.0. Samples: 1921016. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:22:30,374][03180] Avg episode reward: [(0, '2562.426')] [2024-12-13 04:22:35,014][03226] Updated weights for policy 0, policy_version 3760 (0.0014) [2024-12-13 04:22:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1925120. Throughput: 0: 1041.3. Samples: 1925184. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:22:35,371][03180] Avg episode reward: [(0, '2506.069')] [2024-12-13 04:22:40,371][03180] Fps is (10 sec: 819.4, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 1929216. Throughput: 0: 1039.6. Samples: 1928920. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:22:40,371][03180] Avg episode reward: [(0, '2507.269')] [2024-12-13 04:22:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003768_1929216.pth... [2024-12-13 04:22:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003712_1900544.pth [2024-12-13 04:22:45,375][03180] Fps is (10 sec: 1228.3, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1937408. Throughput: 0: 1069.0. Samples: 1936628. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:22:45,375][03180] Avg episode reward: [(0, '2528.187')] [2024-12-13 04:22:50,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1941504. Throughput: 0: 1055.3. Samples: 1942220. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:22:50,375][03180] Avg episode reward: [(0, '2541.731')] [2024-12-13 04:22:55,371][03180] Fps is (10 sec: 819.5, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 1945600. Throughput: 0: 1046.5. Samples: 1945560. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:22:55,371][03180] Avg episode reward: [(0, '2518.796')] [2024-12-13 04:22:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003800_1945600.pth... 
[2024-12-13 04:22:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003744_1916928.pth [2024-12-13 04:23:00,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1953792. Throughput: 0: 1063.0. Samples: 1953088. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:23:00,371][03180] Avg episode reward: [(0, '2518.497')] [2024-12-13 04:23:05,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 1957888. Throughput: 0: 1063.2. Samples: 1958996. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:23:05,374][03180] Avg episode reward: [(0, '2539.929')] [2024-12-13 04:23:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 1961984. Throughput: 0: 1045.0. Samples: 1962048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:23:10,371][03180] Avg episode reward: [(0, '2576.875')] [2024-12-13 04:23:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003832_1961984.pth... [2024-12-13 04:23:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003768_1929216.pth [2024-12-13 04:23:11,232][03226] Updated weights for policy 0, policy_version 3840 (0.0010) [2024-12-13 04:23:15,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1970176. Throughput: 0: 1083.9. Samples: 1969788. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:23:15,371][03180] Avg episode reward: [(0, '2522.743')] [2024-12-13 04:23:20,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1083.0). Total num frames: 1974272. Throughput: 0: 1126.7. Samples: 1975888. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:23:20,373][03180] Avg episode reward: [(0, '2552.034')] [2024-12-13 04:23:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 1978368. Throughput: 0: 1104.7. Samples: 1978632. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:23:25,371][03180] Avg episode reward: [(0, '2531.316')] [2024-12-13 04:23:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003864_1978368.pth... [2024-12-13 04:23:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003800_1945600.pth [2024-12-13 04:23:30,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 1986560. Throughput: 0: 1105.6. Samples: 1986376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:23:30,372][03180] Avg episode reward: [(0, '2555.870')] [2024-12-13 04:23:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1990656. Throughput: 0: 1128.2. Samples: 1992988. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:23:35,371][03180] Avg episode reward: [(0, '2541.416')] [2024-12-13 04:23:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 1994752. Throughput: 0: 1107.4. Samples: 1995392. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:23:40,372][03180] Avg episode reward: [(0, '2577.947')] [2024-12-13 04:23:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003896_1994752.pth... [2024-12-13 04:23:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003832_1961984.pth [2024-12-13 04:23:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2002944. Throughput: 0: 1105.9. Samples: 2002852. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:23:45,371][03180] Avg episode reward: [(0, '2642.312')] [2024-12-13 04:23:47,339][03226] Updated weights for policy 0, policy_version 3920 (0.0009) [2024-12-13 04:23:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2007040. Throughput: 0: 1131.6. Samples: 2009916. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:23:50,371][03180] Avg episode reward: [(0, '2724.949')] [2024-12-13 04:23:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2011136. Throughput: 0: 1114.5. Samples: 2012200. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:23:55,371][03180] Avg episode reward: [(0, '2763.410')] [2024-12-13 04:23:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003928_2011136.pth... [2024-12-13 04:23:55,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003864_1978368.pth [2024-12-13 04:24:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2019328. Throughput: 0: 1100.9. Samples: 2019328. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:24:00,371][03180] Avg episode reward: [(0, '2770.118')] [2024-12-13 04:24:05,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2023424. Throughput: 0: 1124.7. Samples: 2026500. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:24:05,374][03180] Avg episode reward: [(0, '2812.681')] [2024-12-13 04:24:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2027520. Throughput: 0: 1117.9. Samples: 2028936. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:24:10,372][03180] Avg episode reward: [(0, '2807.337')] [2024-12-13 04:24:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003960_2027520.pth... [2024-12-13 04:24:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003896_1994752.pth [2024-12-13 04:24:15,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2035712. Throughput: 0: 1096.4. Samples: 2035712. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:24:15,371][03180] Avg episode reward: [(0, '2834.460')] [2024-12-13 04:24:20,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 2039808. Throughput: 0: 1115.1. Samples: 2043172. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:24:20,375][03180] Avg episode reward: [(0, '2816.926')] [2024-12-13 04:24:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2043904. Throughput: 0: 1117.3. Samples: 2045672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:24:25,372][03180] Avg episode reward: [(0, '2804.613')] [2024-12-13 04:24:25,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000003992_2043904.pth... [2024-12-13 04:24:25,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003928_2011136.pth [2024-12-13 04:24:25,769][03226] Updated weights for policy 0, policy_version 4000 (0.0009) [2024-12-13 04:24:30,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2052096. Throughput: 0: 1094.0. Samples: 2052084. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:24:30,371][03180] Avg episode reward: [(0, '2826.888')] [2024-12-13 04:24:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2056192. Throughput: 0: 1108.3. Samples: 2059788. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:24:35,371][03180] Avg episode reward: [(0, '2790.342')] [2024-12-13 04:24:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 2060288. Throughput: 0: 1117.5. Samples: 2062488. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:24:40,372][03180] Avg episode reward: [(0, '2772.497')] [2024-12-13 04:24:40,445][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004032_2064384.pth... 
[2024-12-13 04:24:40,453][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003960_2027520.pth [2024-12-13 04:24:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2068480. Throughput: 0: 1092.7. Samples: 2068500. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:24:45,371][03180] Avg episode reward: [(0, '2801.396')] [2024-12-13 04:24:50,371][03180] Fps is (10 sec: 1638.4, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2076672. Throughput: 0: 1108.2. Samples: 2076368. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:24:50,371][03180] Avg episode reward: [(0, '2817.950')] [2024-12-13 04:24:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 2080768. Throughput: 0: 1121.2. Samples: 2079392. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:24:55,371][03180] Avg episode reward: [(0, '2868.280')] [2024-12-13 04:24:55,397][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004064_2080768.pth... [2024-12-13 04:24:55,411][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000003992_2043904.pth [2024-12-13 04:24:55,413][03213] Saving new best policy, reward=2868.280! [2024-12-13 04:25:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2084864. Throughput: 0: 1096.7. Samples: 2085064. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:25:00,377][03180] Avg episode reward: [(0, '2815.089')] [2024-12-13 04:25:02,075][03226] Updated weights for policy 0, policy_version 4080 (0.0009) [2024-12-13 04:25:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 2093056. Throughput: 0: 1104.0. Samples: 2092848. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:25:05,371][03180] Avg episode reward: [(0, '2772.226')] [2024-12-13 04:25:10,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 2097152. 
Throughput: 0: 1123.2. Samples: 2096220. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:25:10,375][03180] Avg episode reward: [(0, '2727.171')] [2024-12-13 04:25:10,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004096_2097152.pth... [2024-12-13 04:25:10,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004032_2064384.pth [2024-12-13 04:25:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2101248. Throughput: 0: 1106.0. Samples: 2101852. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:25:15,371][03180] Avg episode reward: [(0, '2735.516')] [2024-12-13 04:25:20,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 2109440. Throughput: 0: 1109.8. Samples: 2109728. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:25:20,371][03180] Avg episode reward: [(0, '2677.037')] [2024-12-13 04:25:25,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 2113536. Throughput: 0: 1134.4. Samples: 2113536. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:25:25,373][03180] Avg episode reward: [(0, '2690.688')] [2024-12-13 04:25:25,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004128_2113536.pth... [2024-12-13 04:25:25,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004064_2080768.pth [2024-12-13 04:25:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2117632. Throughput: 0: 1118.8. Samples: 2118844. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:25:30,371][03180] Avg episode reward: [(0, '2727.771')] [2024-12-13 04:25:35,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2125824. Throughput: 0: 1117.3. Samples: 2126648. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:25:35,371][03180] Avg episode reward: [(0, '2769.445')] [2024-12-13 04:25:37,581][03226] Updated weights for policy 0, policy_version 4160 (0.0009) [2024-12-13 04:25:40,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 2129920. Throughput: 0: 1136.0. Samples: 2130516. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:25:40,373][03180] Avg episode reward: [(0, '2861.081')] [2024-12-13 04:25:40,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004160_2129920.pth... [2024-12-13 04:25:40,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004096_2097152.pth [2024-12-13 04:25:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2134016. Throughput: 0: 1125.9. Samples: 2135728. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:25:45,371][03180] Avg episode reward: [(0, '2845.098')] [2024-12-13 04:25:50,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2142208. Throughput: 0: 1126.0. Samples: 2143520. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:25:50,371][03180] Avg episode reward: [(0, '2849.148')] [2024-12-13 04:25:55,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 2146304. Throughput: 0: 1138.7. Samples: 2147460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:25:55,373][03180] Avg episode reward: [(0, '2878.304')] [2024-12-13 04:25:55,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004192_2146304.pth... [2024-12-13 04:25:55,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004128_2113536.pth [2024-12-13 04:25:55,394][03213] Saving new best policy, reward=2878.304! [2024-12-13 04:26:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2150400. Throughput: 0: 1136.8. Samples: 2153008. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:26:00,371][03180] Avg episode reward: [(0, '2966.884')] [2024-12-13 04:26:00,372][03213] Saving new best policy, reward=2966.884! [2024-12-13 04:26:05,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2158592. Throughput: 0: 1126.7. Samples: 2160428. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:26:05,371][03180] Avg episode reward: [(0, '2938.188')] [2024-12-13 04:26:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2162688. Throughput: 0: 1127.3. Samples: 2164264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:26:10,371][03180] Avg episode reward: [(0, '2997.677')] [2024-12-13 04:26:10,486][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004232_2166784.pth... [2024-12-13 04:26:10,494][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004160_2129920.pth [2024-12-13 04:26:10,514][03213] Saving new best policy, reward=2997.677! [2024-12-13 04:26:15,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 2166784. Throughput: 0: 1120.5. Samples: 2169268. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:26:15,374][03180] Avg episode reward: [(0, '3015.951')] [2024-12-13 04:26:15,375][03213] Saving new best policy, reward=3015.951! [2024-12-13 04:26:16,584][03226] Updated weights for policy 0, policy_version 4240 (0.0010) [2024-12-13 04:26:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 2170880. Throughput: 0: 1061.4. Samples: 2174412. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:26:20,371][03180] Avg episode reward: [(0, '3062.533')] [2024-12-13 04:26:20,372][03213] Saving new best policy, reward=3062.533! [2024-12-13 04:26:25,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2179072. Throughput: 0: 1059.4. Samples: 2178188. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:26:25,371][03180] Avg episode reward: [(0, '3054.350')] [2024-12-13 04:26:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004256_2179072.pth... [2024-12-13 04:26:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004192_2146304.pth [2024-12-13 04:26:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2183168. Throughput: 0: 1107.5. Samples: 2185564. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:26:30,371][03180] Avg episode reward: [(0, '3072.746')] [2024-12-13 04:26:30,372][03213] Saving new best policy, reward=3072.746! [2024-12-13 04:26:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2191360. Throughput: 0: 1058.7. Samples: 2191160. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:26:35,371][03180] Avg episode reward: [(0, '3046.511')] [2024-12-13 04:26:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2195456. Throughput: 0: 1057.6. Samples: 2195048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:26:40,371][03180] Avg episode reward: [(0, '3082.948')] [2024-12-13 04:26:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004288_2195456.pth... [2024-12-13 04:26:40,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004232_2166784.pth [2024-12-13 04:26:40,382][03213] Saving new best policy, reward=3082.948! [2024-12-13 04:26:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2203648. Throughput: 0: 1105.5. Samples: 2202756. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:26:45,371][03180] Avg episode reward: [(0, '3086.835')] [2024-12-13 04:26:45,373][03213] Saving new best policy, reward=3086.835! [2024-12-13 04:26:50,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2207744. 
Throughput: 0: 1055.4. Samples: 2207920. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:26:50,372][03180] Avg episode reward: [(0, '3023.822')] [2024-12-13 04:26:53,081][03226] Updated weights for policy 0, policy_version 4320 (0.0009) [2024-12-13 04:26:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2211840. Throughput: 0: 1057.1. Samples: 2211832. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:26:55,371][03180] Avg episode reward: [(0, '2985.848')] [2024-12-13 04:26:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004320_2211840.pth... [2024-12-13 04:26:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004256_2179072.pth [2024-12-13 04:27:00,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2220032. Throughput: 0: 1119.9. Samples: 2219664. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:27:00,371][03180] Avg episode reward: [(0, '3030.101')] [2024-12-13 04:27:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2224128. Throughput: 0: 1122.0. Samples: 2224900. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:27:05,371][03180] Avg episode reward: [(0, '3010.884')] [2024-12-13 04:27:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2228224. Throughput: 0: 1118.6. Samples: 2228524. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:27:10,371][03180] Avg episode reward: [(0, '3014.754')] [2024-12-13 04:27:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004352_2228224.pth... [2024-12-13 04:27:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004288_2195456.pth [2024-12-13 04:27:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 2236416. Throughput: 0: 1127.0. Samples: 2236280. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:27:15,371][03180] Avg episode reward: [(0, '2995.906')] [2024-12-13 04:27:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 2240512. Throughput: 0: 1130.9. Samples: 2242052. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:27:20,371][03180] Avg episode reward: [(0, '2983.321')] [2024-12-13 04:27:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2244608. Throughput: 0: 1120.2. Samples: 2245456. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:27:25,372][03180] Avg episode reward: [(0, '2984.930')] [2024-12-13 04:27:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004384_2244608.pth... [2024-12-13 04:27:25,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004320_2211840.pth [2024-12-13 04:27:28,762][03226] Updated weights for policy 0, policy_version 4400 (0.0013) [2024-12-13 04:27:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2252800. Throughput: 0: 1119.8. Samples: 2253148. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:27:30,371][03180] Avg episode reward: [(0, '2914.742')] [2024-12-13 04:27:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2256896. Throughput: 0: 1139.2. Samples: 2259184. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:27:35,371][03180] Avg episode reward: [(0, '2889.783')] [2024-12-13 04:27:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2265088. Throughput: 0: 1119.6. Samples: 2262212. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:27:40,371][03180] Avg episode reward: [(0, '2848.597')] [2024-12-13 04:27:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004424_2265088.pth... 
[2024-12-13 04:27:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004352_2228224.pth [2024-12-13 04:27:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2269184. Throughput: 0: 1118.5. Samples: 2269996. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:27:45,371][03180] Avg episode reward: [(0, '2816.666')] [2024-12-13 04:27:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2273280. Throughput: 0: 1144.4. Samples: 2276400. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:27:50,373][03180] Avg episode reward: [(0, '2816.222')] [2024-12-13 04:27:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2281472. Throughput: 0: 1121.3. Samples: 2278984. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:27:55,371][03180] Avg episode reward: [(0, '2815.732')] [2024-12-13 04:27:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004456_2281472.pth... [2024-12-13 04:27:55,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004384_2244608.pth [2024-12-13 04:28:00,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2285568. Throughput: 0: 1120.3. Samples: 2286692. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:28:00,371][03180] Avg episode reward: [(0, '2867.217')] [2024-12-13 04:28:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2289664. Throughput: 0: 1140.8. Samples: 2293388. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:28:05,371][03180] Avg episode reward: [(0, '2842.001')] [2024-12-13 04:28:05,731][03226] Updated weights for policy 0, policy_version 4480 (0.0013) [2024-12-13 04:28:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2297856. Throughput: 0: 1117.9. Samples: 2295760. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:28:10,371][03180] Avg episode reward: [(0, '2866.443')] [2024-12-13 04:28:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004488_2297856.pth... [2024-12-13 04:28:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004424_2265088.pth [2024-12-13 04:28:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2301952. Throughput: 0: 1118.8. Samples: 2303496. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:28:15,371][03180] Avg episode reward: [(0, '2876.571')] [2024-12-13 04:28:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 2310144. Throughput: 0: 1136.3. Samples: 2310316. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:28:20,371][03180] Avg episode reward: [(0, '2972.294')] [2024-12-13 04:28:25,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2314240. Throughput: 0: 1127.1. Samples: 2312936. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:28:25,373][03180] Avg episode reward: [(0, '2890.093')] [2024-12-13 04:28:25,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004520_2314240.pth... [2024-12-13 04:28:25,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004456_2281472.pth [2024-12-13 04:28:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2318336. Throughput: 0: 1119.3. Samples: 2320364. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:28:30,371][03180] Avg episode reward: [(0, '2958.073')] [2024-12-13 04:28:35,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 2326528. Throughput: 0: 1131.3. Samples: 2327308. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:28:35,371][03180] Avg episode reward: [(0, '2911.315')] [2024-12-13 04:28:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2330624. Throughput: 0: 1130.9. Samples: 2329876. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:28:40,371][03180] Avg episode reward: [(0, '2907.271')] [2024-12-13 04:28:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004552_2330624.pth... [2024-12-13 04:28:40,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004488_2297856.pth [2024-12-13 04:28:43,089][03226] Updated weights for policy 0, policy_version 4560 (0.0009) [2024-12-13 04:28:45,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 2334720. Throughput: 0: 1084.6. Samples: 2335500. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:28:45,373][03180] Avg episode reward: [(0, '2961.240')] [2024-12-13 04:28:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2338816. Throughput: 0: 1032.3. Samples: 2339840. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:28:50,372][03180] Avg episode reward: [(0, '2961.240')] [2024-12-13 04:28:55,371][03180] Fps is (10 sec: 819.3, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 2342912. Throughput: 0: 1024.1. Samples: 2341844. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:28:55,371][03180] Avg episode reward: [(0, '2896.235')] [2024-12-13 04:28:55,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004576_2342912.pth... [2024-12-13 04:28:55,397][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004520_2314240.pth [2024-12-13 04:29:00,371][03180] Fps is (10 sec: 409.6, 60 sec: 955.7, 300 sec: 1083.0). Total num frames: 2342912. Throughput: 0: 936.3. Samples: 2345628. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:29:00,372][03180] Avg episode reward: [(0, '2888.598')] [2024-12-13 04:29:05,371][03180] Fps is (10 sec: 409.6, 60 sec: 955.7, 300 sec: 1083.0). Total num frames: 2347008. Throughput: 0: 877.7. Samples: 2349812. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:29:05,375][03180] Avg episode reward: [(0, '2985.908')] [2024-12-13 04:29:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 887.5, 300 sec: 1069.1). Total num frames: 2351104. Throughput: 0: 864.8. Samples: 2351852. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:29:10,371][03180] Avg episode reward: [(0, '3049.182')] [2024-12-13 04:29:10,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004592_2351104.pth... [2024-12-13 04:29:10,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004552_2330624.pth [2024-12-13 04:29:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 887.5, 300 sec: 1069.1). Total num frames: 2355200. Throughput: 0: 787.1. Samples: 2355784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:29:15,371][03180] Avg episode reward: [(0, '3060.144')] [2024-12-13 04:29:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 819.2, 300 sec: 1069.1). Total num frames: 2359296. Throughput: 0: 718.0. Samples: 2359616. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:29:20,371][03180] Avg episode reward: [(0, '3048.494')] [2024-12-13 04:29:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 819.2, 300 sec: 1055.2). Total num frames: 2363392. Throughput: 0: 711.6. Samples: 2361900. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:29:25,371][03180] Avg episode reward: [(0, '3054.075')] [2024-12-13 04:29:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004616_2363392.pth... [2024-12-13 04:29:25,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004576_2342912.pth [2024-12-13 04:29:30,371][03180] Fps is (10 sec: 409.6, 60 sec: 750.9, 300 sec: 1041.4). 
Total num frames: 2363392. Throughput: 0: 684.9. Samples: 2366320. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:29:30,372][03180] Avg episode reward: [(0, '3085.072')] [2024-12-13 04:29:35,371][03180] Fps is (10 sec: 409.6, 60 sec: 682.7, 300 sec: 1041.4). Total num frames: 2367488. Throughput: 0: 673.6. Samples: 2370152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:29:35,373][03180] Avg episode reward: [(0, '3036.695')] [2024-12-13 04:29:40,372][03180] Fps is (10 sec: 819.1, 60 sec: 682.6, 300 sec: 1027.5). Total num frames: 2371584. Throughput: 0: 669.9. Samples: 2371992. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:29:40,373][03180] Avg episode reward: [(0, '2977.421')] [2024-12-13 04:29:40,387][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004632_2371584.pth... [2024-12-13 04:29:40,397][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004592_2351104.pth [2024-12-13 04:29:43,456][03226] Updated weights for policy 0, policy_version 4640 (0.0013) [2024-12-13 04:29:45,375][03180] Fps is (10 sec: 818.8, 60 sec: 682.6, 300 sec: 1013.6). Total num frames: 2375680. Throughput: 0: 677.6. Samples: 2376124. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:29:45,378][03180] Avg episode reward: [(0, '3007.051')] [2024-12-13 04:29:50,371][03180] Fps is (10 sec: 819.3, 60 sec: 682.7, 300 sec: 1013.6). Total num frames: 2379776. Throughput: 0: 676.5. Samples: 2380256. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:29:50,376][03180] Avg episode reward: [(0, '2978.058')] [2024-12-13 04:29:55,371][03180] Fps is (10 sec: 409.8, 60 sec: 614.4, 300 sec: 999.7). Total num frames: 2379776. Throughput: 0: 677.1. Samples: 2382320. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:29:55,371][03180] Avg episode reward: [(0, '2962.251')] [2024-12-13 04:29:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004648_2379776.pth... 
[2024-12-13 04:29:55,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004616_2363392.pth [2024-12-13 04:30:00,371][03180] Fps is (10 sec: 409.6, 60 sec: 682.7, 300 sec: 985.8). Total num frames: 2383872. Throughput: 0: 671.1. Samples: 2385984. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:30:00,371][03180] Avg episode reward: [(0, '2968.072')] [2024-12-13 04:30:05,373][03180] Fps is (10 sec: 819.0, 60 sec: 682.6, 300 sec: 985.8). Total num frames: 2387968. Throughput: 0: 666.8. Samples: 2389624. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:30:05,373][03180] Avg episode reward: [(0, '2931.328')] [2024-12-13 04:30:10,374][03180] Fps is (10 sec: 818.9, 60 sec: 682.6, 300 sec: 985.8). Total num frames: 2392064. Throughput: 0: 661.5. Samples: 2391668. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:30:10,375][03180] Avg episode reward: [(0, '2870.439')] [2024-12-13 04:30:10,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004672_2392064.pth... [2024-12-13 04:30:10,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004632_2371584.pth [2024-12-13 04:30:15,371][03180] Fps is (10 sec: 409.7, 60 sec: 614.4, 300 sec: 958.0). Total num frames: 2392064. Throughput: 0: 644.9. Samples: 2395340. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:30:15,371][03180] Avg episode reward: [(0, '2859.434')] [2024-12-13 04:30:20,371][03180] Fps is (10 sec: 409.8, 60 sec: 614.4, 300 sec: 958.1). Total num frames: 2396160. Throughput: 0: 641.1. Samples: 2399000. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:30:20,371][03180] Avg episode reward: [(0, '2843.782')] [2024-12-13 04:30:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 614.4, 300 sec: 958.0). Total num frames: 2400256. Throughput: 0: 640.0. Samples: 2400792. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:30:25,371][03180] Avg episode reward: [(0, '2843.266')] [2024-12-13 04:30:25,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004688_2400256.pth... [2024-12-13 04:30:25,395][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004648_2379776.pth [2024-12-13 04:30:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 682.7, 300 sec: 944.2). Total num frames: 2404352. Throughput: 0: 638.3. Samples: 2404844. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:30:30,371][03180] Avg episode reward: [(0, '2840.725')] [2024-12-13 04:30:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 682.7, 300 sec: 944.2). Total num frames: 2408448. Throughput: 0: 651.6. Samples: 2409580. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:30:35,372][03180] Avg episode reward: [(0, '2910.803')] [2024-12-13 04:30:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 682.7, 300 sec: 944.2). Total num frames: 2412544. Throughput: 0: 668.7. Samples: 2412412. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:30:40,371][03180] Avg episode reward: [(0, '2968.750')] [2024-12-13 04:30:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004712_2412544.pth... [2024-12-13 04:30:40,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004672_2392064.pth [2024-12-13 04:30:41,409][03226] Updated weights for policy 0, policy_version 4720 (0.0030) [2024-12-13 04:30:45,371][03180] Fps is (10 sec: 1228.9, 60 sec: 751.0, 300 sec: 944.2). Total num frames: 2420736. Throughput: 0: 758.2. Samples: 2420104. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:30:45,371][03180] Avg episode reward: [(0, '2951.642')] [2024-12-13 04:30:50,371][03180] Fps is (10 sec: 1228.7, 60 sec: 750.9, 300 sec: 944.2). Total num frames: 2424832. Throughput: 0: 810.3. Samples: 2426088. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:30:50,372][03180] Avg episode reward: [(0, '3055.747')] [2024-12-13 04:30:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 819.2, 300 sec: 944.2). Total num frames: 2428928. Throughput: 0: 816.6. Samples: 2428412. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:30:55,372][03180] Avg episode reward: [(0, '3067.473')] [2024-12-13 04:30:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004744_2428928.pth... [2024-12-13 04:30:55,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004688_2400256.pth [2024-12-13 04:31:00,378][03180] Fps is (10 sec: 1228.0, 60 sec: 887.4, 300 sec: 944.1). Total num frames: 2437120. Throughput: 0: 892.9. Samples: 2435528. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:31:00,379][03180] Avg episode reward: [(0, '3183.241')] [2024-12-13 04:31:00,379][03213] Saving new best policy, reward=3183.241! [2024-12-13 04:31:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 887.5, 300 sec: 944.2). Total num frames: 2441216. Throughput: 0: 965.8. Samples: 2442460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:31:05,374][03180] Avg episode reward: [(0, '3202.208')] [2024-12-13 04:31:05,375][03213] Saving new best policy, reward=3202.208! [2024-12-13 04:31:10,371][03180] Fps is (10 sec: 819.8, 60 sec: 887.5, 300 sec: 944.2). Total num frames: 2445312. Throughput: 0: 984.0. Samples: 2445072. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:31:10,371][03180] Avg episode reward: [(0, '3238.894')] [2024-12-13 04:31:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004776_2445312.pth... [2024-12-13 04:31:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004712_2412544.pth [2024-12-13 04:31:10,386][03213] Saving new best policy, reward=3238.894! [2024-12-13 04:31:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1024.0, 300 sec: 958.0). Total num frames: 2453504. Throughput: 0: 1048.2. 
Samples: 2452012. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:31:15,371][03180] Avg episode reward: [(0, '3253.856')] [2024-12-13 04:31:15,372][03213] Saving new best policy, reward=3253.856! [2024-12-13 04:31:18,430][03226] Updated weights for policy 0, policy_version 4800 (0.0014) [2024-12-13 04:31:20,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1024.0, 300 sec: 944.2). Total num frames: 2457600. Throughput: 0: 1109.9. Samples: 2459524. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:31:20,372][03180] Avg episode reward: [(0, '3295.294')] [2024-12-13 04:31:20,373][03213] Saving new best policy, reward=3295.294! [2024-12-13 04:31:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 944.2). Total num frames: 2461696. Throughput: 0: 1100.2. Samples: 2461920. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:31:25,371][03180] Avg episode reward: [(0, '3340.004')] [2024-12-13 04:31:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004808_2461696.pth... [2024-12-13 04:31:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004744_2428928.pth [2024-12-13 04:31:25,384][03213] Saving new best policy, reward=3340.004! [2024-12-13 04:31:30,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 944.2). Total num frames: 2469888. Throughput: 0: 1079.4. Samples: 2468676. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:31:30,371][03180] Avg episode reward: [(0, '3350.679')] [2024-12-13 04:31:30,372][03213] Saving new best policy, reward=3350.679! [2024-12-13 04:31:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 944.2). Total num frames: 2473984. Throughput: 0: 1116.9. Samples: 2476348. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:31:35,374][03180] Avg episode reward: [(0, '3432.862')] [2024-12-13 04:31:35,375][03213] Saving new best policy, reward=3432.862! [2024-12-13 04:31:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 930.3). 
Total num frames: 2478080. Throughput: 0: 1119.3. Samples: 2478780. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:31:40,371][03180] Avg episode reward: [(0, '3460.376')] [2024-12-13 04:31:40,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004840_2478080.pth... [2024-12-13 04:31:40,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004776_2445312.pth [2024-12-13 04:31:40,393][03213] Saving new best policy, reward=3460.376! [2024-12-13 04:31:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 944.2). Total num frames: 2486272. Throughput: 0: 1105.8. Samples: 2485280. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:31:45,371][03180] Avg episode reward: [(0, '3576.974')] [2024-12-13 04:31:45,372][03213] Saving new best policy, reward=3576.974! [2024-12-13 04:31:50,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 944.2). Total num frames: 2490368. Throughput: 0: 1124.1. Samples: 2493048. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:31:50,373][03180] Avg episode reward: [(0, '3593.056')] [2024-12-13 04:31:50,374][03213] Saving new best policy, reward=3593.056! [2024-12-13 04:31:55,375][03180] Fps is (10 sec: 818.8, 60 sec: 1092.2, 300 sec: 930.3). Total num frames: 2494464. Throughput: 0: 1125.9. Samples: 2495744. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:31:55,376][03180] Avg episode reward: [(0, '3620.267')] [2024-12-13 04:31:55,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004872_2494464.pth... [2024-12-13 04:31:55,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004808_2461696.pth [2024-12-13 04:31:55,388][03213] Saving new best policy, reward=3620.267! [2024-12-13 04:31:56,439][03226] Updated weights for policy 0, policy_version 4880 (0.0009) [2024-12-13 04:32:00,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.4, 300 sec: 944.2). Total num frames: 2502656. Throughput: 0: 1110.0. Samples: 2501964. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:32:00,371][03180] Avg episode reward: [(0, '3644.328')] [2024-12-13 04:32:00,372][03213] Saving new best policy, reward=3644.328! [2024-12-13 04:32:05,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1092.3, 300 sec: 944.2). Total num frames: 2506752. Throughput: 0: 1111.5. Samples: 2509540. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:32:05,371][03180] Avg episode reward: [(0, '3589.212')] [2024-12-13 04:32:10,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 930.3). Total num frames: 2510848. Throughput: 0: 1125.3. Samples: 2512560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:32:10,373][03180] Avg episode reward: [(0, '3582.788')] [2024-12-13 04:32:10,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004904_2510848.pth... [2024-12-13 04:32:10,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004840_2478080.pth [2024-12-13 04:32:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 944.2). Total num frames: 2519040. Throughput: 0: 1108.0. Samples: 2518536. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:32:15,371][03180] Avg episode reward: [(0, '3528.056')] [2024-12-13 04:32:20,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 944.2). Total num frames: 2523136. Throughput: 0: 1105.2. Samples: 2526084. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:32:20,371][03180] Avg episode reward: [(0, '3499.291')] [2024-12-13 04:32:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 930.3). Total num frames: 2527232. Throughput: 0: 1126.4. Samples: 2529468. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:32:25,371][03180] Avg episode reward: [(0, '3471.336')] [2024-12-13 04:32:25,398][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004944_2531328.pth... 
[2024-12-13 04:32:25,405][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004872_2494464.pth [2024-12-13 04:32:30,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 944.2). Total num frames: 2535424. Throughput: 0: 1107.7. Samples: 2535128. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:32:30,375][03180] Avg episode reward: [(0, '3406.205')] [2024-12-13 04:32:32,565][03226] Updated weights for policy 0, policy_version 4960 (0.0009) [2024-12-13 04:32:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 930.3). Total num frames: 2539520. Throughput: 0: 1105.3. Samples: 2542784. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:32:35,371][03180] Avg episode reward: [(0, '3479.412')] [2024-12-13 04:32:40,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 944.2). Total num frames: 2547712. Throughput: 0: 1124.0. Samples: 2546320. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:32:40,371][03180] Avg episode reward: [(0, '3497.275')] [2024-12-13 04:32:40,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000004976_2547712.pth... [2024-12-13 04:32:40,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004904_2510848.pth [2024-12-13 04:32:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 944.2). Total num frames: 2551808. Throughput: 0: 1103.9. Samples: 2551640. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:32:45,371][03180] Avg episode reward: [(0, '3505.036')] [2024-12-13 04:32:50,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.3, 300 sec: 930.3). Total num frames: 2555904. Throughput: 0: 1108.6. Samples: 2559428. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:32:50,375][03180] Avg episode reward: [(0, '3490.592')] [2024-12-13 04:32:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 944.2). Total num frames: 2564096. Throughput: 0: 1127.8. Samples: 2563308. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:32:55,371][03180] Avg episode reward: [(0, '3527.592')] [2024-12-13 04:32:55,392][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005008_2564096.pth... [2024-12-13 04:32:55,397][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004944_2531328.pth [2024-12-13 04:33:00,374][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 944.2). Total num frames: 2568192. Throughput: 0: 1109.5. Samples: 2568468. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:33:00,375][03180] Avg episode reward: [(0, '3482.341')] [2024-12-13 04:33:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 944.2). Total num frames: 2576384. Throughput: 0: 1113.9. Samples: 2576208. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:33:05,371][03180] Avg episode reward: [(0, '3443.736')] [2024-12-13 04:33:08,446][03226] Updated weights for policy 0, policy_version 5040 (0.0009) [2024-12-13 04:33:10,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1160.6, 300 sec: 944.2). Total num frames: 2580480. Throughput: 0: 1124.6. Samples: 2580076. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:33:10,371][03180] Avg episode reward: [(0, '3460.420')] [2024-12-13 04:33:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005040_2580480.pth... [2024-12-13 04:33:10,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000004976_2547712.pth [2024-12-13 04:33:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 930.3). Total num frames: 2584576. Throughput: 0: 1120.8. Samples: 2585560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:33:15,371][03180] Avg episode reward: [(0, '3412.246')] [2024-12-13 04:33:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 944.2). Total num frames: 2592768. Throughput: 0: 1112.9. Samples: 2592864. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:33:20,371][03180] Avg episode reward: [(0, '3457.795')] [2024-12-13 04:33:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 944.2). Total num frames: 2596864. Throughput: 0: 1122.0. Samples: 2596812. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:33:25,371][03180] Avg episode reward: [(0, '3446.828')] [2024-12-13 04:33:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005072_2596864.pth... [2024-12-13 04:33:25,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005008_2564096.pth [2024-12-13 04:33:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 930.3). Total num frames: 2600960. Throughput: 0: 1134.5. Samples: 2602692. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:33:30,371][03180] Avg episode reward: [(0, '3417.916')] [2024-12-13 04:33:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 944.2). Total num frames: 2609152. Throughput: 0: 1111.6. Samples: 2609448. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:33:35,371][03180] Avg episode reward: [(0, '3435.262')] [2024-12-13 04:33:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 944.2). Total num frames: 2613248. Throughput: 0: 1111.2. Samples: 2613312. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:33:40,371][03180] Avg episode reward: [(0, '3451.967')] [2024-12-13 04:33:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005104_2613248.pth... [2024-12-13 04:33:40,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005040_2580480.pth [2024-12-13 04:33:45,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 944.2). Total num frames: 2617344. Throughput: 0: 1134.1. Samples: 2619500. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:33:45,373][03180] Avg episode reward: [(0, '3436.909')] [2024-12-13 04:33:47,001][03226] Updated weights for policy 0, policy_version 5120 (0.0014) [2024-12-13 04:33:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 944.2). Total num frames: 2621440. Throughput: 0: 1056.5. Samples: 2623752. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:33:50,371][03180] Avg episode reward: [(0, '3488.501')] [2024-12-13 04:33:55,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 971.9). Total num frames: 2629632. Throughput: 0: 1049.6. Samples: 2627308. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:33:55,371][03180] Avg episode reward: [(0, '3418.367')] [2024-12-13 04:33:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005136_2629632.pth... [2024-12-13 04:33:55,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005072_2596864.pth [2024-12-13 04:34:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 971.9). Total num frames: 2633728. Throughput: 0: 1096.1. Samples: 2634884. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:34:00,371][03180] Avg episode reward: [(0, '3416.662')] [2024-12-13 04:34:05,374][03180] Fps is (10 sec: 818.9, 60 sec: 1023.9, 300 sec: 971.9). Total num frames: 2637824. Throughput: 0: 1051.6. Samples: 2640188. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:34:05,378][03180] Avg episode reward: [(0, '3418.278')] [2024-12-13 04:34:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 985.8). Total num frames: 2646016. Throughput: 0: 1049.2. Samples: 2644024. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:34:10,371][03180] Avg episode reward: [(0, '3386.698')] [2024-12-13 04:34:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005168_2646016.pth... 
[2024-12-13 04:34:10,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005104_2613248.pth [2024-12-13 04:34:15,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 985.8). Total num frames: 2650112. Throughput: 0: 1091.4. Samples: 2651804. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:34:15,372][03180] Avg episode reward: [(0, '3378.143')] [2024-12-13 04:34:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 985.8). Total num frames: 2654208. Throughput: 0: 1059.8. Samples: 2657140. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:34:20,371][03180] Avg episode reward: [(0, '3398.002')] [2024-12-13 04:34:24,230][03226] Updated weights for policy 0, policy_version 5200 (0.0008) [2024-12-13 04:34:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1013.6). Total num frames: 2662400. Throughput: 0: 1058.3. Samples: 2660936. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:34:25,371][03180] Avg episode reward: [(0, '3399.630')] [2024-12-13 04:34:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005200_2662400.pth... [2024-12-13 04:34:25,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005136_2629632.pth [2024-12-13 04:34:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1013.6). Total num frames: 2666496. Throughput: 0: 1091.3. Samples: 2668608. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:34:30,371][03180] Avg episode reward: [(0, '3410.624')] [2024-12-13 04:34:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1013.6). Total num frames: 2670592. Throughput: 0: 1122.1. Samples: 2674248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:34:35,372][03180] Avg episode reward: [(0, '3380.050')] [2024-12-13 04:34:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1027.5). Total num frames: 2678784. Throughput: 0: 1119.5. Samples: 2677684. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:34:40,371][03180] Avg episode reward: [(0, '3340.880')] [2024-12-13 04:34:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005232_2678784.pth... [2024-12-13 04:34:40,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005168_2646016.pth [2024-12-13 04:34:45,371][03180] Fps is (10 sec: 1638.5, 60 sec: 1160.6, 300 sec: 1041.4). Total num frames: 2686976. Throughput: 0: 1120.9. Samples: 2685324. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:34:45,373][03180] Avg episode reward: [(0, '3397.050')] [2024-12-13 04:34:50,378][03180] Fps is (10 sec: 1227.9, 60 sec: 1160.4, 300 sec: 1055.2). Total num frames: 2691072. Throughput: 0: 1131.9. Samples: 2691128. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:34:50,379][03180] Avg episode reward: [(0, '3413.836')] [2024-12-13 04:34:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1055.2). Total num frames: 2695168. Throughput: 0: 1120.4. Samples: 2694444. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:34:55,371][03180] Avg episode reward: [(0, '3390.426')] [2024-12-13 04:34:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005264_2695168.pth... [2024-12-13 04:34:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005200_2662400.pth [2024-12-13 04:35:00,029][03226] Updated weights for policy 0, policy_version 5280 (0.0010) [2024-12-13 04:35:00,371][03180] Fps is (10 sec: 1229.7, 60 sec: 1160.5, 300 sec: 1069.1). Total num frames: 2703360. Throughput: 0: 1118.6. Samples: 2702140. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:35:00,372][03180] Avg episode reward: [(0, '3397.747')] [2024-12-13 04:35:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1069.1). Total num frames: 2707456. Throughput: 0: 1134.7. Samples: 2708200. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:35:05,371][03180] Avg episode reward: [(0, '3375.379')] [2024-12-13 04:35:10,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1083.0). Total num frames: 2711552. Throughput: 0: 1113.4. Samples: 2711040. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:35:10,371][03180] Avg episode reward: [(0, '3458.103')] [2024-12-13 04:35:10,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005296_2711552.pth... [2024-12-13 04:35:10,380][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005232_2678784.pth [2024-12-13 04:35:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 2719744. Throughput: 0: 1117.7. Samples: 2718904. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:35:15,371][03180] Avg episode reward: [(0, '3434.886')] [2024-12-13 04:35:20,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1096.9). Total num frames: 2723840. Throughput: 0: 1137.2. Samples: 2725424. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:35:20,373][03180] Avg episode reward: [(0, '3416.279')] [2024-12-13 04:35:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2727936. Throughput: 0: 1116.7. Samples: 2727936. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:35:25,371][03180] Avg episode reward: [(0, '3491.639')] [2024-12-13 04:35:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005328_2727936.pth... [2024-12-13 04:35:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005264_2695168.pth [2024-12-13 04:35:30,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2736128. Throughput: 0: 1118.4. Samples: 2735652. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:35:30,371][03180] Avg episode reward: [(0, '3521.572')] [2024-12-13 04:35:35,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2740224. Throughput: 0: 1139.0. Samples: 2742376. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:35:35,373][03180] Avg episode reward: [(0, '3494.009')] [2024-12-13 04:35:37,458][03226] Updated weights for policy 0, policy_version 5360 (0.0009) [2024-12-13 04:35:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2744320. Throughput: 0: 1119.6. Samples: 2744828. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:35:40,371][03180] Avg episode reward: [(0, '3509.448')] [2024-12-13 04:35:40,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005360_2744320.pth... [2024-12-13 04:35:40,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005296_2711552.pth [2024-12-13 04:35:45,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2752512. Throughput: 0: 1113.7. Samples: 2752256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:35:45,371][03180] Avg episode reward: [(0, '3595.133')] [2024-12-13 04:35:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1110.8). Total num frames: 2756608. Throughput: 0: 1141.3. Samples: 2759560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:35:50,371][03180] Avg episode reward: [(0, '3690.181')] [2024-12-13 04:35:50,374][03213] Saving new best policy, reward=3690.181! [2024-12-13 04:35:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2760704. Throughput: 0: 1131.8. Samples: 2761972. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:35:55,371][03180] Avg episode reward: [(0, '3669.636')] [2024-12-13 04:35:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005392_2760704.pth... 
[2024-12-13 04:35:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005328_2727936.pth [2024-12-13 04:36:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2768896. Throughput: 0: 1110.9. Samples: 2768896. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:36:00,371][03180] Avg episode reward: [(0, '3719.105')] [2024-12-13 04:36:00,372][03213] Saving new best policy, reward=3719.105! [2024-12-13 04:36:05,376][03180] Fps is (10 sec: 1228.1, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 2772992. Throughput: 0: 1134.0. Samples: 2776460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:36:05,377][03180] Avg episode reward: [(0, '3750.097')] [2024-12-13 04:36:05,378][03213] Saving new best policy, reward=3750.097! [2024-12-13 04:36:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2777088. Throughput: 0: 1133.4. Samples: 2778940. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:36:10,371][03180] Avg episode reward: [(0, '3766.190')] [2024-12-13 04:36:10,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005424_2777088.pth... [2024-12-13 04:36:10,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005360_2744320.pth [2024-12-13 04:36:10,387][03213] Saving new best policy, reward=3766.190! [2024-12-13 04:36:13,855][03226] Updated weights for policy 0, policy_version 5440 (0.0011) [2024-12-13 04:36:15,371][03180] Fps is (10 sec: 1229.5, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2785280. Throughput: 0: 1107.6. Samples: 2785492. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:36:15,371][03180] Avg episode reward: [(0, '3779.863')] [2024-12-13 04:36:15,372][03213] Saving new best policy, reward=3779.863! [2024-12-13 04:36:20,371][03180] Fps is (10 sec: 1638.4, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 2793472. Throughput: 0: 1130.7. Samples: 2793256. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:36:20,371][03180] Avg episode reward: [(0, '3860.412')] [2024-12-13 04:36:20,372][03213] Saving new best policy, reward=3860.412! [2024-12-13 04:36:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2797568. Throughput: 0: 1138.9. Samples: 2796080. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:36:25,371][03180] Avg episode reward: [(0, '3937.095')] [2024-12-13 04:36:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005464_2797568.pth... [2024-12-13 04:36:25,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005392_2760704.pth [2024-12-13 04:36:25,393][03213] Saving new best policy, reward=3937.095! [2024-12-13 04:36:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2801664. Throughput: 0: 1109.3. Samples: 2802176. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:36:30,371][03180] Avg episode reward: [(0, '4006.964')] [2024-12-13 04:36:30,372][03213] Saving new best policy, reward=4006.964! [2024-12-13 04:36:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 2809856. Throughput: 0: 1120.1. Samples: 2809964. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:36:35,371][03180] Avg episode reward: [(0, '3933.586')] [2024-12-13 04:36:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2813952. Throughput: 0: 1138.3. Samples: 2813196. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:36:40,371][03180] Avg episode reward: [(0, '3976.007')] [2024-12-13 04:36:40,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005496_2813952.pth... [2024-12-13 04:36:40,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005424_2777088.pth [2024-12-13 04:36:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2818048. 
Throughput: 0: 1112.2. Samples: 2818944. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:36:45,372][03180] Avg episode reward: [(0, '4022.465')] [2024-12-13 04:36:45,373][03213] Saving new best policy, reward=4022.465! [2024-12-13 04:36:49,818][03226] Updated weights for policy 0, policy_version 5520 (0.0010) [2024-12-13 04:36:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 2826240. Throughput: 0: 1112.3. Samples: 2826508. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:36:50,371][03180] Avg episode reward: [(0, '4047.878')] [2024-12-13 04:36:50,372][03213] Saving new best policy, reward=4047.878! [2024-12-13 04:36:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2830336. Throughput: 0: 1136.6. Samples: 2830088. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:36:55,371][03180] Avg episode reward: [(0, '4090.664')] [2024-12-13 04:36:55,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005528_2830336.pth... [2024-12-13 04:36:55,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005464_2797568.pth [2024-12-13 04:36:55,390][03213] Saving new best policy, reward=4090.664! [2024-12-13 04:37:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2834432. Throughput: 0: 1112.4. Samples: 2835552. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:37:00,371][03180] Avg episode reward: [(0, '4125.207')] [2024-12-13 04:37:00,372][03213] Saving new best policy, reward=4125.207! [2024-12-13 04:37:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 2842624. Throughput: 0: 1109.1. Samples: 2843164. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:37:05,371][03180] Avg episode reward: [(0, '4134.008')] [2024-12-13 04:37:05,372][03213] Saving new best policy, reward=4134.008! 
[2024-12-13 04:37:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2846720. Throughput: 0: 1131.2. Samples: 2846984. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:37:10,373][03180] Avg episode reward: [(0, '4144.896')] [2024-12-13 04:37:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005560_2846720.pth... [2024-12-13 04:37:10,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005496_2813952.pth [2024-12-13 04:37:10,391][03213] Saving new best policy, reward=4144.896! [2024-12-13 04:37:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2850816. Throughput: 0: 1112.9. Samples: 2852256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:37:15,371][03180] Avg episode reward: [(0, '4142.689')] [2024-12-13 04:37:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 2859008. Throughput: 0: 1114.4. Samples: 2860112. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:37:20,371][03180] Avg episode reward: [(0, '4212.715')] [2024-12-13 04:37:20,372][03213] Saving new best policy, reward=4212.715! [2024-12-13 04:37:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2863104. Throughput: 0: 1130.0. Samples: 2864048. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:37:25,371][03180] Avg episode reward: [(0, '4235.711')] [2024-12-13 04:37:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005592_2863104.pth... [2024-12-13 04:37:25,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005528_2830336.pth [2024-12-13 04:37:25,391][03213] Saving new best policy, reward=4235.711! [2024-12-13 04:37:26,676][03226] Updated weights for policy 0, policy_version 5600 (0.0010) [2024-12-13 04:37:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2867200. Throughput: 0: 1118.0. Samples: 2869256. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:37:30,371][03180] Avg episode reward: [(0, '4168.037')] [2024-12-13 04:37:35,375][03180] Fps is (10 sec: 1228.3, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 2875392. Throughput: 0: 1088.6. Samples: 2875500. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:37:35,383][03180] Avg episode reward: [(0, '4097.076')] [2024-12-13 04:37:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2879488. Throughput: 0: 1067.5. Samples: 2878124. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:37:40,374][03180] Avg episode reward: [(0, '4131.397')] [2024-12-13 04:37:40,385][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005624_2879488.pth... [2024-12-13 04:37:40,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005560_2846720.pth [2024-12-13 04:37:45,373][03180] Fps is (10 sec: 819.4, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 2883584. Throughput: 0: 1067.3. Samples: 2883584. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:37:45,374][03180] Avg episode reward: [(0, '4098.507')] [2024-12-13 04:37:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 2887680. Throughput: 0: 1068.2. Samples: 2891232. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:37:50,371][03180] Avg episode reward: [(0, '4112.537')] [2024-12-13 04:37:55,372][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 2895872. Throughput: 0: 1069.4. Samples: 2895108. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:37:55,373][03180] Avg episode reward: [(0, '4080.621')] [2024-12-13 04:37:55,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005656_2895872.pth... 
[2024-12-13 04:37:55,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005592_2863104.pth [2024-12-13 04:38:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2899968. Throughput: 0: 1080.0. Samples: 2900856. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:38:00,371][03180] Avg episode reward: [(0, '4086.324')] [2024-12-13 04:38:05,186][03226] Updated weights for policy 0, policy_version 5680 (0.0009) [2024-12-13 04:38:05,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2908160. Throughput: 0: 1065.5. Samples: 2908060. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:38:05,374][03180] Avg episode reward: [(0, '4050.136')] [2024-12-13 04:38:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2912256. Throughput: 0: 1062.6. Samples: 2911864. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:38:10,372][03180] Avg episode reward: [(0, '4032.756')] [2024-12-13 04:38:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005688_2912256.pth... [2024-12-13 04:38:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005624_2879488.pth [2024-12-13 04:38:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 2916352. Throughput: 0: 1078.4. Samples: 2917784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:38:15,371][03180] Avg episode reward: [(0, '4013.928')] [2024-12-13 04:38:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2924544. Throughput: 0: 1091.0. Samples: 2924592. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:38:20,371][03180] Avg episode reward: [(0, '3928.518')] [2024-12-13 04:38:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2928640. Throughput: 0: 1120.8. Samples: 2928560. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:38:25,371][03180] Avg episode reward: [(0, '3951.433')] [2024-12-13 04:38:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005720_2928640.pth... [2024-12-13 04:38:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005656_2895872.pth [2024-12-13 04:38:30,375][03180] Fps is (10 sec: 818.8, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 2932736. Throughput: 0: 1138.0. Samples: 2934796. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:38:30,376][03180] Avg episode reward: [(0, '3874.331')] [2024-12-13 04:38:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2940928. Throughput: 0: 1112.1. Samples: 2941276. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:38:35,371][03180] Avg episode reward: [(0, '3647.631')] [2024-12-13 04:38:40,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2945024. Throughput: 0: 1111.5. Samples: 2945124. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:38:40,371][03180] Avg episode reward: [(0, '3619.122')] [2024-12-13 04:38:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005752_2945024.pth... [2024-12-13 04:38:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005688_2912256.pth [2024-12-13 04:38:41,244][03226] Updated weights for policy 0, policy_version 5760 (0.0010) [2024-12-13 04:38:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2949120. Throughput: 0: 1129.1. Samples: 2951664. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:38:45,371][03180] Avg episode reward: [(0, '3591.221')] [2024-12-13 04:38:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 2957312. Throughput: 0: 1107.4. Samples: 2957892. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:38:50,371][03180] Avg episode reward: [(0, '3645.704')] [2024-12-13 04:38:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2961408. Throughput: 0: 1109.2. Samples: 2961776. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:38:55,371][03180] Avg episode reward: [(0, '3571.567')] [2024-12-13 04:38:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005784_2961408.pth... [2024-12-13 04:38:55,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005720_2928640.pth [2024-12-13 04:39:00,374][03180] Fps is (10 sec: 1228.3, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 2969600. Throughput: 0: 1135.9. Samples: 2968904. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:39:00,377][03180] Avg episode reward: [(0, '3507.805')] [2024-12-13 04:39:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2973696. Throughput: 0: 1113.3. Samples: 2974692. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:39:05,371][03180] Avg episode reward: [(0, '3468.588')] [2024-12-13 04:39:10,371][03180] Fps is (10 sec: 819.5, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2977792. Throughput: 0: 1110.6. Samples: 2978536. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:39:10,372][03180] Avg episode reward: [(0, '3465.912')] [2024-12-13 04:39:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005816_2977792.pth... [2024-12-13 04:39:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005752_2945024.pth [2024-12-13 04:39:15,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 2985984. Throughput: 0: 1137.6. Samples: 2985984. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:39:15,379][03180] Avg episode reward: [(0, '3469.992')] [2024-12-13 04:39:18,851][03226] Updated weights for policy 0, policy_version 5840 (0.0009) [2024-12-13 04:39:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 2990080. Throughput: 0: 1119.2. Samples: 2991640. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:39:20,371][03180] Avg episode reward: [(0, '3422.985')] [2024-12-13 04:39:25,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 2998272. Throughput: 0: 1120.8. Samples: 2995560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:39:25,371][03180] Avg episode reward: [(0, '3354.814')] [2024-12-13 04:39:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005856_2998272.pth... [2024-12-13 04:39:25,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005784_2961408.pth [2024-12-13 04:39:30,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 3002368. Throughput: 0: 1147.6. Samples: 3003308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:39:30,372][03180] Avg episode reward: [(0, '3335.821')] [2024-12-13 04:39:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3006464. Throughput: 0: 1127.8. Samples: 3008644. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:39:35,371][03180] Avg episode reward: [(0, '3412.405')] [2024-12-13 04:39:40,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 3014656. Throughput: 0: 1130.6. Samples: 3012652. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:39:40,371][03180] Avg episode reward: [(0, '3445.279')] [2024-12-13 04:39:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005888_3014656.pth... 
[2024-12-13 04:39:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005816_2977792.pth [2024-12-13 04:39:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 3018752. Throughput: 0: 1147.3. Samples: 3020528. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:39:45,371][03180] Avg episode reward: [(0, '3499.275')] [2024-12-13 04:39:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3022848. Throughput: 0: 1138.2. Samples: 3025912. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:39:50,371][03180] Avg episode reward: [(0, '3656.171')] [2024-12-13 04:39:54,114][03226] Updated weights for policy 0, policy_version 5920 (0.0010) [2024-12-13 04:39:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 3031040. Throughput: 0: 1137.4. Samples: 3029720. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:39:55,373][03180] Avg episode reward: [(0, '3743.662')] [2024-12-13 04:39:55,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005920_3031040.pth... [2024-12-13 04:39:55,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005856_2998272.pth [2024-12-13 04:40:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3035136. Throughput: 0: 1144.5. Samples: 3037484. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:40:00,371][03180] Avg episode reward: [(0, '3777.370')] [2024-12-13 04:40:05,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 3039232. Throughput: 0: 1143.4. Samples: 3043092. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:40:05,372][03180] Avg episode reward: [(0, '3778.601')] [2024-12-13 04:40:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 3047424. Throughput: 0: 1133.8. Samples: 3046580. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:40:10,371][03180] Avg episode reward: [(0, '3868.762')] [2024-12-13 04:40:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005952_3047424.pth... [2024-12-13 04:40:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005888_3014656.pth [2024-12-13 04:40:15,371][03180] Fps is (10 sec: 1638.6, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 3055616. Throughput: 0: 1132.9. Samples: 3054288. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:40:15,371][03180] Avg episode reward: [(0, '3843.252')] [2024-12-13 04:40:20,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3059712. Throughput: 0: 1141.6. Samples: 3060016. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:40:20,372][03180] Avg episode reward: [(0, '3861.932')] [2024-12-13 04:40:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3063808. Throughput: 0: 1126.7. Samples: 3063352. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:40:25,371][03180] Avg episode reward: [(0, '3944.172')] [2024-12-13 04:40:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000005984_3063808.pth... [2024-12-13 04:40:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005920_3031040.pth [2024-12-13 04:40:29,969][03226] Updated weights for policy 0, policy_version 6000 (0.0010) [2024-12-13 04:40:30,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 3072000. Throughput: 0: 1122.0. Samples: 3071016. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:40:30,371][03180] Avg episode reward: [(0, '4036.045')] [2024-12-13 04:40:35,378][03180] Fps is (10 sec: 1227.9, 60 sec: 1160.4, 300 sec: 1124.6). Total num frames: 3076096. Throughput: 0: 1136.4. Samples: 3077060. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:40:35,379][03180] Avg episode reward: [(0, '4043.279')] [2024-12-13 04:40:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3080192. Throughput: 0: 1116.3. Samples: 3079952. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:40:40,371][03180] Avg episode reward: [(0, '4061.781')] [2024-12-13 04:40:40,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006016_3080192.pth... [2024-12-13 04:40:40,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005952_3047424.pth [2024-12-13 04:40:45,371][03180] Fps is (10 sec: 1229.7, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3088384. Throughput: 0: 1114.0. Samples: 3087616. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:40:45,371][03180] Avg episode reward: [(0, '4128.938')] [2024-12-13 04:40:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3092480. Throughput: 0: 1133.8. Samples: 3094112. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:40:50,371][03180] Avg episode reward: [(0, '4141.224')] [2024-12-13 04:40:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3096576. Throughput: 0: 1112.4. Samples: 3096636. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:40:55,371][03180] Avg episode reward: [(0, '4115.833')] [2024-12-13 04:40:55,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006048_3096576.pth... [2024-12-13 04:40:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000005984_3063808.pth [2024-12-13 04:41:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3104768. Throughput: 0: 1113.3. Samples: 3104388. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:41:00,371][03180] Avg episode reward: [(0, '4115.107')] [2024-12-13 04:41:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 3108864. Throughput: 0: 1138.3. Samples: 3111240. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:41:05,373][03180] Avg episode reward: [(0, '4038.419')] [2024-12-13 04:41:07,215][03226] Updated weights for policy 0, policy_version 6080 (0.0015) [2024-12-13 04:41:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3112960. Throughput: 0: 1118.1. Samples: 3113668. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:41:10,371][03180] Avg episode reward: [(0, '4044.749')] [2024-12-13 04:41:10,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006080_3112960.pth... [2024-12-13 04:41:10,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006016_3080192.pth [2024-12-13 04:41:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3121152. Throughput: 0: 1112.4. Samples: 3121072. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:41:15,371][03180] Avg episode reward: [(0, '3865.853')] [2024-12-13 04:41:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3125248. Throughput: 0: 1138.3. Samples: 3128276. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:41:20,371][03180] Avg episode reward: [(0, '3792.938')] [2024-12-13 04:41:25,377][03180] Fps is (10 sec: 818.7, 60 sec: 1092.1, 300 sec: 1110.8). Total num frames: 3129344. Throughput: 0: 1124.8. Samples: 3130576. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:41:25,378][03180] Avg episode reward: [(0, '3800.023')] [2024-12-13 04:41:25,392][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006112_3129344.pth... 
[2024-12-13 04:41:25,404][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006048_3096576.pth [2024-12-13 04:41:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 3133440. Throughput: 0: 1054.2. Samples: 3135056. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:41:30,371][03180] Avg episode reward: [(0, '3854.561')] [2024-12-13 04:41:35,371][03180] Fps is (10 sec: 1229.6, 60 sec: 1092.4, 300 sec: 1110.8). Total num frames: 3141632. Throughput: 0: 1081.2. Samples: 3142764. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:41:35,371][03180] Avg episode reward: [(0, '3839.612')] [2024-12-13 04:41:40,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 3145728. Throughput: 0: 1094.5. Samples: 3145888. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:41:40,373][03180] Avg episode reward: [(0, '3872.793')] [2024-12-13 04:41:40,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006144_3145728.pth... [2024-12-13 04:41:40,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006080_3112960.pth [2024-12-13 04:41:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 3149824. Throughput: 0: 1054.7. Samples: 3151848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:41:45,371][03180] Avg episode reward: [(0, '3867.359')] [2024-12-13 04:41:45,682][03226] Updated weights for policy 0, policy_version 6160 (0.0009) [2024-12-13 04:41:50,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3158016. Throughput: 0: 1074.3. Samples: 3159584. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:41:50,371][03180] Avg episode reward: [(0, '3841.741')] [2024-12-13 04:41:55,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 3162112. Throughput: 0: 1095.1. Samples: 3162952. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:41:55,378][03180] Avg episode reward: [(0, '3859.213')] [2024-12-13 04:41:55,385][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006176_3162112.pth... [2024-12-13 04:41:55,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006112_3129344.pth [2024-12-13 04:42:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 3166208. Throughput: 0: 1056.6. Samples: 3168620. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:42:00,371][03180] Avg episode reward: [(0, '3801.257')] [2024-12-13 04:42:05,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3174400. Throughput: 0: 1067.2. Samples: 3176300. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:42:05,371][03180] Avg episode reward: [(0, '3699.878')] [2024-12-13 04:42:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3178496. Throughput: 0: 1097.1. Samples: 3179940. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:42:10,371][03180] Avg episode reward: [(0, '3752.062')] [2024-12-13 04:42:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006208_3178496.pth... [2024-12-13 04:42:10,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006144_3145728.pth [2024-12-13 04:42:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3186688. Throughput: 0: 1116.2. Samples: 3185284. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:42:15,371][03180] Avg episode reward: [(0, '3752.335')] [2024-12-13 04:42:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3190784. Throughput: 0: 1115.8. Samples: 3192976. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:42:20,371][03180] Avg episode reward: [(0, '3678.554')] [2024-12-13 04:42:21,599][03226] Updated weights for policy 0, policy_version 6240 (0.0010) [2024-12-13 04:42:25,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1160.6, 300 sec: 1124.6). Total num frames: 3198976. Throughput: 0: 1131.1. Samples: 3196792. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:42:25,376][03180] Avg episode reward: [(0, '3678.920')] [2024-12-13 04:42:25,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006248_3198976.pth... [2024-12-13 04:42:25,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006176_3162112.pth [2024-12-13 04:42:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 3203072. Throughput: 0: 1119.9. Samples: 3202244. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:42:30,371][03180] Avg episode reward: [(0, '3652.059')] [2024-12-13 04:42:35,371][03180] Fps is (10 sec: 819.6, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3207168. Throughput: 0: 1115.7. Samples: 3209792. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:42:35,371][03180] Avg episode reward: [(0, '3816.632')] [2024-12-13 04:42:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 3215360. Throughput: 0: 1125.8. Samples: 3213608. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:42:40,371][03180] Avg episode reward: [(0, '3783.767')] [2024-12-13 04:42:40,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006280_3215360.pth... [2024-12-13 04:42:40,380][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006208_3178496.pth [2024-12-13 04:42:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3219456. Throughput: 0: 1127.5. Samples: 3219356. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:42:45,371][03180] Avg episode reward: [(0, '3742.953')] [2024-12-13 04:42:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3223552. Throughput: 0: 1118.2. Samples: 3226620. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:42:50,371][03180] Avg episode reward: [(0, '3784.724')] [2024-12-13 04:42:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 3231744. Throughput: 0: 1125.7. Samples: 3230596. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:42:55,371][03180] Avg episode reward: [(0, '3781.756')] [2024-12-13 04:42:55,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006312_3231744.pth... [2024-12-13 04:42:55,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006248_3198976.pth [2024-12-13 04:42:58,061][03226] Updated weights for policy 0, policy_version 6320 (0.0010) [2024-12-13 04:43:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 3235840. Throughput: 0: 1142.4. Samples: 3236692. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:43:00,371][03180] Avg episode reward: [(0, '3762.443')] [2024-12-13 04:43:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3244032. Throughput: 0: 1130.0. Samples: 3243824. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:43:05,371][03180] Avg episode reward: [(0, '3803.856')] [2024-12-13 04:43:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3248128. Throughput: 0: 1131.1. Samples: 3247684. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:43:10,371][03180] Avg episode reward: [(0, '3768.894')] [2024-12-13 04:43:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006344_3248128.pth... 
[2024-12-13 04:43:10,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006280_3215360.pth [2024-12-13 04:43:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3252224. Throughput: 0: 1151.8. Samples: 3254076. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:43:15,371][03180] Avg episode reward: [(0, '3711.890')] [2024-12-13 04:43:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3260416. Throughput: 0: 1130.7. Samples: 3260672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:43:20,371][03180] Avg episode reward: [(0, '3694.156')] [2024-12-13 04:43:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1124.7). Total num frames: 3264512. Throughput: 0: 1132.6. Samples: 3264576. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:43:25,371][03180] Avg episode reward: [(0, '3685.098')] [2024-12-13 04:43:25,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006376_3264512.pth... [2024-12-13 04:43:25,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006312_3231744.pth [2024-12-13 04:43:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3268608. Throughput: 0: 1158.9. Samples: 3271508. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:43:30,371][03180] Avg episode reward: [(0, '3726.055')] [2024-12-13 04:43:34,297][03226] Updated weights for policy 0, policy_version 6400 (0.0009) [2024-12-13 04:43:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3276800. Throughput: 0: 1136.4. Samples: 3277756. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:43:35,371][03180] Avg episode reward: [(0, '3786.990')] [2024-12-13 04:43:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3280896. Throughput: 0: 1132.8. Samples: 3281572. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:43:40,372][03180] Avg episode reward: [(0, '3798.726')] [2024-12-13 04:43:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006408_3280896.pth... [2024-12-13 04:43:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006344_3248128.pth [2024-12-13 04:43:45,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3289088. Throughput: 0: 1155.1. Samples: 3288672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:43:45,373][03180] Avg episode reward: [(0, '3733.973')] [2024-12-13 04:43:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3293184. Throughput: 0: 1126.2. Samples: 3294504. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:43:50,371][03180] Avg episode reward: [(0, '3720.546')] [2024-12-13 04:43:55,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3297280. Throughput: 0: 1125.9. Samples: 3298348. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:43:55,371][03180] Avg episode reward: [(0, '3802.332')] [2024-12-13 04:43:55,385][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006448_3301376.pth... [2024-12-13 04:43:55,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006376_3264512.pth [2024-12-13 04:44:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3305472. Throughput: 0: 1142.1. Samples: 3305472. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:44:00,371][03180] Avg episode reward: [(0, '3660.440')] [2024-12-13 04:44:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3309568. Throughput: 0: 1123.9. Samples: 3311248. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:44:05,371][03180] Avg episode reward: [(0, '3549.797')] [2024-12-13 04:44:10,191][03226] Updated weights for policy 0, policy_version 6480 (0.0010) [2024-12-13 04:44:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3317760. Throughput: 0: 1120.8. Samples: 3315012. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:44:10,371][03180] Avg episode reward: [(0, '3426.147')] [2024-12-13 04:44:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006480_3317760.pth... [2024-12-13 04:44:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006408_3280896.pth [2024-12-13 04:44:15,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3321856. Throughput: 0: 1131.2. Samples: 3322416. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:44:15,374][03180] Avg episode reward: [(0, '3471.461')] [2024-12-13 04:44:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3325952. Throughput: 0: 1115.6. Samples: 3327960. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:44:20,371][03180] Avg episode reward: [(0, '3542.377')] [2024-12-13 04:44:25,372][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3334144. Throughput: 0: 1116.2. Samples: 3331804. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:44:25,373][03180] Avg episode reward: [(0, '3604.025')] [2024-12-13 04:44:25,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006512_3334144.pth... [2024-12-13 04:44:25,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006448_3301376.pth [2024-12-13 04:44:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3338240. Throughput: 0: 1131.5. Samples: 3339588. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:44:30,371][03180] Avg episode reward: [(0, '3621.727')] [2024-12-13 04:44:35,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3342336. Throughput: 0: 1118.8. Samples: 3344848. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:44:35,371][03180] Avg episode reward: [(0, '3602.509')] [2024-12-13 04:44:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3350528. Throughput: 0: 1119.2. Samples: 3348712. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:44:40,371][03180] Avg episode reward: [(0, '3625.587')] [2024-12-13 04:44:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006544_3350528.pth... [2024-12-13 04:44:40,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006480_3317760.pth [2024-12-13 04:44:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3354624. Throughput: 0: 1134.2. Samples: 3356512. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:44:45,371][03180] Avg episode reward: [(0, '3624.290')] [2024-12-13 04:44:46,259][03226] Updated weights for policy 0, policy_version 6560 (0.0010) [2024-12-13 04:44:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3358720. Throughput: 0: 1126.8. Samples: 3361952. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:44:50,371][03180] Avg episode reward: [(0, '3604.034')] [2024-12-13 04:44:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3366912. Throughput: 0: 1121.7. Samples: 3365488. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:44:55,372][03180] Avg episode reward: [(0, '3616.032')] [2024-12-13 04:44:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006576_3366912.pth... 
[2024-12-13 04:44:55,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006512_3334144.pth [2024-12-13 04:45:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3371008. Throughput: 0: 1131.0. Samples: 3373308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:45:00,371][03180] Avg episode reward: [(0, '3609.807')] [2024-12-13 04:45:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3379200. Throughput: 0: 1138.3. Samples: 3379184. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:45:05,371][03180] Avg episode reward: [(0, '3648.466')] [2024-12-13 04:45:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3383296. Throughput: 0: 1125.8. Samples: 3382464. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:45:10,371][03180] Avg episode reward: [(0, '3755.552')] [2024-12-13 04:45:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006608_3383296.pth... [2024-12-13 04:45:10,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006544_3350528.pth [2024-12-13 04:45:15,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3387392. Throughput: 0: 1099.7. Samples: 3389076. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:45:15,374][03180] Avg episode reward: [(0, '3823.692')] [2024-12-13 04:45:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3391488. Throughput: 0: 1092.2. Samples: 3393996. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:45:20,371][03180] Avg episode reward: [(0, '3849.116')] [2024-12-13 04:45:25,214][03226] Updated weights for policy 0, policy_version 6640 (0.0018) [2024-12-13 04:45:25,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3399680. Throughput: 0: 1068.1. Samples: 3396776. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:45:25,371][03180] Avg episode reward: [(0, '3926.821')] [2024-12-13 04:45:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006640_3399680.pth... [2024-12-13 04:45:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006576_3366912.pth [2024-12-13 04:45:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3403776. Throughput: 0: 1068.1. Samples: 3404576. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:45:30,373][03180] Avg episode reward: [(0, '3999.148')] [2024-12-13 04:45:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3407872. Throughput: 0: 1099.8. Samples: 3411444. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:45:35,371][03180] Avg episode reward: [(0, '4091.960')] [2024-12-13 04:45:40,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 3416064. Throughput: 0: 1076.5. Samples: 3413932. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:45:40,373][03180] Avg episode reward: [(0, '4110.895')] [2024-12-13 04:45:40,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006672_3416064.pth... [2024-12-13 04:45:40,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006608_3383296.pth [2024-12-13 04:45:45,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 3420160. Throughput: 0: 1076.8. Samples: 3421768. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:45:45,373][03180] Avg episode reward: [(0, '4132.629')] [2024-12-13 04:45:50,372][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3428352. Throughput: 0: 1101.1. Samples: 3428736. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:45:50,372][03180] Avg episode reward: [(0, '4121.931')] [2024-12-13 04:45:55,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3432448. Throughput: 0: 1087.5. Samples: 3431400. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:45:55,371][03180] Avg episode reward: [(0, '4194.992')] [2024-12-13 04:45:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006704_3432448.pth... [2024-12-13 04:45:55,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006640_3399680.pth [2024-12-13 04:46:00,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3436544. Throughput: 0: 1104.2. Samples: 3438764. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:46:00,371][03180] Avg episode reward: [(0, '4212.487')] [2024-12-13 04:46:00,505][03226] Updated weights for policy 0, policy_version 6720 (0.0009) [2024-12-13 04:46:05,380][03180] Fps is (10 sec: 1227.7, 60 sec: 1092.1, 300 sec: 1124.6). Total num frames: 3444736. Throughput: 0: 1154.6. Samples: 3445964. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:46:05,380][03180] Avg episode reward: [(0, '4229.622')] [2024-12-13 04:46:10,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3448832. Throughput: 0: 1149.9. Samples: 3448524. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:46:10,372][03180] Avg episode reward: [(0, '4272.686')] [2024-12-13 04:46:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006736_3448832.pth... [2024-12-13 04:46:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006672_3416064.pth [2024-12-13 04:46:10,384][03213] Saving new best policy, reward=4272.686! [2024-12-13 04:46:15,371][03180] Fps is (10 sec: 1229.9, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 3457024. Throughput: 0: 1129.2. Samples: 3455392. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:46:15,374][03180] Avg episode reward: [(0, '4225.159')] [2024-12-13 04:46:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3461120. Throughput: 0: 1145.1. Samples: 3462976. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:46:20,373][03180] Avg episode reward: [(0, '4193.890')] [2024-12-13 04:46:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3465216. Throughput: 0: 1144.5. Samples: 3465432. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:46:25,371][03180] Avg episode reward: [(0, '4146.328')] [2024-12-13 04:46:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006768_3465216.pth... [2024-12-13 04:46:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006704_3432448.pth [2024-12-13 04:46:30,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3473408. Throughput: 0: 1117.7. Samples: 3472060. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:46:30,371][03180] Avg episode reward: [(0, '3999.651')] [2024-12-13 04:46:35,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3477504. Throughput: 0: 1140.2. Samples: 3480048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:46:35,375][03180] Avg episode reward: [(0, '4024.140')] [2024-12-13 04:46:36,748][03226] Updated weights for policy 0, policy_version 6800 (0.0010) [2024-12-13 04:46:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3481600. Throughput: 0: 1140.9. Samples: 3482740. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:46:40,371][03180] Avg episode reward: [(0, '4065.194')] [2024-12-13 04:46:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006800_3481600.pth... 
[2024-12-13 04:46:40,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006736_3448832.pth [2024-12-13 04:46:45,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 3489792. Throughput: 0: 1120.2. Samples: 3489172. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:46:45,372][03180] Avg episode reward: [(0, '4129.204')] [2024-12-13 04:46:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3493888. Throughput: 0: 1134.5. Samples: 3497008. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:46:50,371][03180] Avg episode reward: [(0, '4101.230')] [2024-12-13 04:46:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3497984. Throughput: 0: 1146.9. Samples: 3500132. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:46:55,372][03180] Avg episode reward: [(0, '4112.177')] [2024-12-13 04:46:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006832_3497984.pth... [2024-12-13 04:46:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006768_3465216.pth [2024-12-13 04:47:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3506176. Throughput: 0: 1127.9. Samples: 3506148. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:47:00,371][03180] Avg episode reward: [(0, '4022.783')] [2024-12-13 04:47:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1124.7). Total num frames: 3510272. Throughput: 0: 1132.5. Samples: 3513936. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:47:05,372][03180] Avg episode reward: [(0, '3899.006')] [2024-12-13 04:47:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3518464. Throughput: 0: 1150.8. Samples: 3517216. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:47:10,371][03180] Avg episode reward: [(0, '3852.994')] [2024-12-13 04:47:10,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006872_3518464.pth... [2024-12-13 04:47:10,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006800_3481600.pth [2024-12-13 04:47:13,761][03226] Updated weights for policy 0, policy_version 6880 (0.0009) [2024-12-13 04:47:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3522560. Throughput: 0: 1130.1. Samples: 3522916. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:47:15,371][03180] Avg episode reward: [(0, '3870.120')] [2024-12-13 04:47:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3530752. Throughput: 0: 1126.8. Samples: 3530752. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:47:20,371][03180] Avg episode reward: [(0, '3832.593')] [2024-12-13 04:47:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3534848. Throughput: 0: 1153.1. Samples: 3534628. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:47:25,372][03180] Avg episode reward: [(0, '3859.145')] [2024-12-13 04:47:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006904_3534848.pth... [2024-12-13 04:47:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006832_3497984.pth [2024-12-13 04:47:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3538944. Throughput: 0: 1128.8. Samples: 3539968. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:47:30,377][03180] Avg episode reward: [(0, '3841.581')] [2024-12-13 04:47:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 3547136. Throughput: 0: 1129.1. Samples: 3547816. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:47:35,371][03180] Avg episode reward: [(0, '3954.783')] [2024-12-13 04:47:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3551232. Throughput: 0: 1144.5. Samples: 3551636. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:47:40,371][03180] Avg episode reward: [(0, '3981.569')] [2024-12-13 04:47:40,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006936_3551232.pth... [2024-12-13 04:47:40,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006872_3518464.pth [2024-12-13 04:47:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3555328. Throughput: 0: 1129.2. Samples: 3556964. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:47:45,371][03180] Avg episode reward: [(0, '4013.674')] [2024-12-13 04:47:49,113][03226] Updated weights for policy 0, policy_version 6960 (0.0011) [2024-12-13 04:47:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3563520. Throughput: 0: 1129.4. Samples: 3564760. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:47:50,371][03180] Avg episode reward: [(0, '4035.208')] [2024-12-13 04:47:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3567616. Throughput: 0: 1142.0. Samples: 3568608. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:47:55,371][03180] Avg episode reward: [(0, '3972.467')] [2024-12-13 04:47:55,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000006968_3567616.pth... [2024-12-13 04:47:55,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006904_3534848.pth [2024-12-13 04:48:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3571712. Throughput: 0: 1138.0. Samples: 3574128. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:48:00,372][03180] Avg episode reward: [(0, '3931.356')] [2024-12-13 04:48:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3579904. Throughput: 0: 1130.8. Samples: 3581636. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:48:05,371][03180] Avg episode reward: [(0, '3891.989')] [2024-12-13 04:48:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3584000. Throughput: 0: 1127.3. Samples: 3585356. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:48:10,371][03180] Avg episode reward: [(0, '3935.512')] [2024-12-13 04:48:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007000_3584000.pth... [2024-12-13 04:48:10,397][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006936_3551232.pth [2024-12-13 04:48:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3592192. Throughput: 0: 1137.5. Samples: 3591156. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:48:15,371][03180] Avg episode reward: [(0, '3932.849')] [2024-12-13 04:48:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3596288. Throughput: 0: 1121.9. Samples: 3598300. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:48:20,371][03180] Avg episode reward: [(0, '3957.132')] [2024-12-13 04:48:24,797][03226] Updated weights for policy 0, policy_version 7040 (0.0010) [2024-12-13 04:48:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 3604480. Throughput: 0: 1122.1. Samples: 3602132. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:48:25,371][03180] Avg episode reward: [(0, '3956.143')] [2024-12-13 04:48:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007040_3604480.pth... 
[2024-12-13 04:48:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000006968_3567616.pth [2024-12-13 04:48:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3608576. Throughput: 0: 1140.1. Samples: 3608268. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:48:30,376][03180] Avg episode reward: [(0, '3919.819')] [2024-12-13 04:48:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3612672. Throughput: 0: 1124.2. Samples: 3615348. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:48:35,371][03180] Avg episode reward: [(0, '3953.108')] [2024-12-13 04:48:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3620864. Throughput: 0: 1123.4. Samples: 3619160. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:48:40,371][03180] Avg episode reward: [(0, '3977.620')] [2024-12-13 04:48:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007072_3620864.pth... [2024-12-13 04:48:40,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007000_3584000.pth [2024-12-13 04:48:45,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3624960. Throughput: 0: 1138.2. Samples: 3625348. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:48:45,372][03180] Avg episode reward: [(0, '3950.512')] [2024-12-13 04:48:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3629056. Throughput: 0: 1123.1. Samples: 3632176. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:48:50,371][03180] Avg episode reward: [(0, '3950.863')] [2024-12-13 04:48:55,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3637248. Throughput: 0: 1125.1. Samples: 3635984. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:48:55,371][03180] Avg episode reward: [(0, '3941.784')] [2024-12-13 04:48:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007104_3637248.pth... [2024-12-13 04:48:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007040_3604480.pth [2024-12-13 04:49:00,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3641344. Throughput: 0: 1140.5. Samples: 3642480. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:49:00,375][03180] Avg episode reward: [(0, '3971.002')] [2024-12-13 04:49:02,627][03226] Updated weights for policy 0, policy_version 7120 (0.0009) [2024-12-13 04:49:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3645440. Throughput: 0: 1094.3. Samples: 3647544. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:49:05,371][03180] Avg episode reward: [(0, '3970.913')] [2024-12-13 04:49:10,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3649536. Throughput: 0: 1067.8. Samples: 3650184. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:49:10,372][03180] Avg episode reward: [(0, '3935.804')] [2024-12-13 04:49:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007128_3649536.pth... [2024-12-13 04:49:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007072_3620864.pth [2024-12-13 04:49:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3657728. Throughput: 0: 1088.2. Samples: 3657236. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:49:15,371][03180] Avg episode reward: [(0, '3955.244')] [2024-12-13 04:49:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3661824. Throughput: 0: 1065.0. Samples: 3663272. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:49:20,371][03180] Avg episode reward: [(0, '3898.852')] [2024-12-13 04:49:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3670016. Throughput: 0: 1067.7. Samples: 3667208. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:49:25,371][03180] Avg episode reward: [(0, '3923.180')] [2024-12-13 04:49:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007168_3670016.pth... [2024-12-13 04:49:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007104_3637248.pth [2024-12-13 04:49:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3674112. Throughput: 0: 1093.8. Samples: 3674568. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:49:30,373][03180] Avg episode reward: [(0, '3863.555')] [2024-12-13 04:49:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3678208. Throughput: 0: 1072.3. Samples: 3680428. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:49:35,371][03180] Avg episode reward: [(0, '3845.312')] [2024-12-13 04:49:39,693][03226] Updated weights for policy 0, policy_version 7200 (0.0010) [2024-12-13 04:49:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3686400. Throughput: 0: 1074.8. Samples: 3684348. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:49:40,371][03180] Avg episode reward: [(0, '3910.089')] [2024-12-13 04:49:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007200_3686400.pth... [2024-12-13 04:49:40,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007128_3649536.pth [2024-12-13 04:49:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3690496. Throughput: 0: 1096.7. Samples: 3691832. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:49:45,371][03180] Avg episode reward: [(0, '3914.042')] [2024-12-13 04:49:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3694592. Throughput: 0: 1103.4. Samples: 3697196. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:49:50,371][03180] Avg episode reward: [(0, '3911.647')] [2024-12-13 04:49:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3702784. Throughput: 0: 1132.1. Samples: 3701128. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:49:55,372][03180] Avg episode reward: [(0, '3981.547')] [2024-12-13 04:49:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007232_3702784.pth... [2024-12-13 04:49:55,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007168_3670016.pth [2024-12-13 04:50:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3706880. Throughput: 0: 1147.2. Samples: 3708860. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:50:00,371][03180] Avg episode reward: [(0, '3976.909')] [2024-12-13 04:50:05,374][03180] Fps is (10 sec: 818.9, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 3710976. Throughput: 0: 1134.1. Samples: 3714312. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:50:05,377][03180] Avg episode reward: [(0, '3927.557')] [2024-12-13 04:50:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3719168. Throughput: 0: 1127.3. Samples: 3717936. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:50:10,371][03180] Avg episode reward: [(0, '3964.624')] [2024-12-13 04:50:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007264_3719168.pth... 
[2024-12-13 04:50:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007200_3686400.pth [2024-12-13 04:50:15,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3723264. Throughput: 0: 1134.5. Samples: 3725620. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:50:15,371][03180] Avg episode reward: [(0, '3942.442')] [2024-12-13 04:50:15,395][03226] Updated weights for policy 0, policy_version 7280 (0.0012) [2024-12-13 04:50:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3731456. Throughput: 0: 1134.0. Samples: 3731456. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:50:20,371][03180] Avg episode reward: [(0, '4004.557')] [2024-12-13 04:50:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3735552. Throughput: 0: 1124.2. Samples: 3734936. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:50:25,371][03180] Avg episode reward: [(0, '3983.255')] [2024-12-13 04:50:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007296_3735552.pth... [2024-12-13 04:50:25,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007232_3702784.pth [2024-12-13 04:50:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 3743744. Throughput: 0: 1127.6. Samples: 3742576. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:50:30,371][03180] Avg episode reward: [(0, '3926.157')] [2024-12-13 04:50:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3747840. Throughput: 0: 1138.6. Samples: 3748432. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:50:35,371][03180] Avg episode reward: [(0, '3966.824')] [2024-12-13 04:50:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3751936. Throughput: 0: 1121.8. Samples: 3751608. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:50:40,371][03180] Avg episode reward: [(0, '3930.252')] [2024-12-13 04:50:40,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007328_3751936.pth... [2024-12-13 04:50:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007264_3719168.pth [2024-12-13 04:50:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3760128. Throughput: 0: 1122.8. Samples: 3759384. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:50:45,371][03180] Avg episode reward: [(0, '3903.694')] [2024-12-13 04:50:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3764224. Throughput: 0: 1138.8. Samples: 3765552. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:50:50,371][03180] Avg episode reward: [(0, '3887.450')] [2024-12-13 04:50:53,007][03226] Updated weights for policy 0, policy_version 7360 (0.0010) [2024-12-13 04:50:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3768320. Throughput: 0: 1120.7. Samples: 3768368. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:50:55,371][03180] Avg episode reward: [(0, '3882.261')] [2024-12-13 04:50:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007360_3768320.pth... [2024-12-13 04:50:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007296_3735552.pth [2024-12-13 04:51:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3776512. Throughput: 0: 1121.7. Samples: 3776096. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:51:00,371][03180] Avg episode reward: [(0, '3874.058')] [2024-12-13 04:51:05,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3780608. Throughput: 0: 1134.2. Samples: 3782500. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:51:05,375][03180] Avg episode reward: [(0, '3897.714')] [2024-12-13 04:51:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3784704. Throughput: 0: 1113.8. Samples: 3785056. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:51:10,371][03180] Avg episode reward: [(0, '3972.325')] [2024-12-13 04:51:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007392_3784704.pth... [2024-12-13 04:51:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007328_3751936.pth [2024-12-13 04:51:15,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3792896. Throughput: 0: 1117.2. Samples: 3792852. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:51:15,371][03180] Avg episode reward: [(0, '3971.788')] [2024-12-13 04:51:20,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3796992. Throughput: 0: 1137.9. Samples: 3799640. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:51:20,372][03180] Avg episode reward: [(0, '3865.892')] [2024-12-13 04:51:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3801088. Throughput: 0: 1123.0. Samples: 3802144. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:51:25,371][03180] Avg episode reward: [(0, '3778.258')] [2024-12-13 04:51:25,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007424_3801088.pth... [2024-12-13 04:51:25,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007360_3768320.pth [2024-12-13 04:51:28,632][03226] Updated weights for policy 0, policy_version 7440 (0.0009) [2024-12-13 04:51:30,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3809280. Throughput: 0: 1120.1. Samples: 3809788. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:51:30,371][03180] Avg episode reward: [(0, '3698.847')] [2024-12-13 04:51:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3813376. Throughput: 0: 1144.6. Samples: 3817060. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:51:35,371][03180] Avg episode reward: [(0, '3720.185')] [2024-12-13 04:51:40,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3821568. Throughput: 0: 1136.1. Samples: 3819496. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:51:40,373][03180] Avg episode reward: [(0, '3702.723')] [2024-12-13 04:51:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007464_3821568.pth... [2024-12-13 04:51:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007392_3784704.pth [2024-12-13 04:51:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3825664. Throughput: 0: 1119.5. Samples: 3826472. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:51:45,371][03180] Avg episode reward: [(0, '3784.233')] [2024-12-13 04:51:50,374][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 3833856. Throughput: 0: 1143.6. Samples: 3833960. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:51:50,375][03180] Avg episode reward: [(0, '3826.820')] [2024-12-13 04:51:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3837952. Throughput: 0: 1145.3. Samples: 3836596. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:51:55,371][03180] Avg episode reward: [(0, '3863.464')] [2024-12-13 04:51:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007496_3837952.pth... 
[2024-12-13 04:51:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007424_3801088.pth [2024-12-13 04:52:00,371][03180] Fps is (10 sec: 819.5, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3842048. Throughput: 0: 1124.5. Samples: 3843456. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:52:00,371][03180] Avg episode reward: [(0, '3782.031')] [2024-12-13 04:52:04,291][03226] Updated weights for policy 0, policy_version 7520 (0.0009) [2024-12-13 04:52:05,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 3850240. Throughput: 0: 1142.8. Samples: 3851068. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:52:05,372][03180] Avg episode reward: [(0, '3816.320')] [2024-12-13 04:52:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3854336. Throughput: 0: 1147.6. Samples: 3853788. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:52:10,374][03180] Avg episode reward: [(0, '3749.526')] [2024-12-13 04:52:10,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007528_3854336.pth... [2024-12-13 04:52:10,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007464_3821568.pth [2024-12-13 04:52:15,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3858432. Throughput: 0: 1119.2. Samples: 3860152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:52:15,371][03180] Avg episode reward: [(0, '3809.422')] [2024-12-13 04:52:20,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3866624. Throughput: 0: 1126.6. Samples: 3867760. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:52:20,372][03180] Avg episode reward: [(0, '3822.423')] [2024-12-13 04:52:25,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3870720. Throughput: 0: 1137.3. Samples: 3870676. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:52:25,374][03180] Avg episode reward: [(0, '3826.292')] [2024-12-13 04:52:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007560_3870720.pth... [2024-12-13 04:52:25,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007496_3837952.pth [2024-12-13 04:52:30,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3874816. Throughput: 0: 1123.4. Samples: 3877024. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:52:30,371][03180] Avg episode reward: [(0, '3849.892')] [2024-12-13 04:52:35,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3883008. Throughput: 0: 1131.5. Samples: 3884872. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:52:35,371][03180] Avg episode reward: [(0, '3825.130')] [2024-12-13 04:52:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3887104. Throughput: 0: 1143.0. Samples: 3888032. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:52:40,371][03180] Avg episode reward: [(0, '3918.489')] [2024-12-13 04:52:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007592_3887104.pth... [2024-12-13 04:52:40,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007528_3854336.pth [2024-12-13 04:52:41,992][03226] Updated weights for policy 0, policy_version 7600 (0.0016) [2024-12-13 04:52:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3895296. Throughput: 0: 1121.4. Samples: 3893920. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:52:45,371][03180] Avg episode reward: [(0, '4047.564')] [2024-12-13 04:52:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3899392. Throughput: 0: 1126.2. Samples: 3901748. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:52:50,372][03180] Avg episode reward: [(0, '4110.609')] [2024-12-13 04:52:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3903488. Throughput: 0: 1140.2. Samples: 3905096. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:52:55,371][03180] Avg episode reward: [(0, '4114.367')] [2024-12-13 04:52:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007624_3903488.pth... [2024-12-13 04:52:55,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007560_3870720.pth [2024-12-13 04:53:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3907584. Throughput: 0: 1096.3. Samples: 3909484. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:53:00,373][03180] Avg episode reward: [(0, '4151.321')] [2024-12-13 04:53:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3915776. Throughput: 0: 1068.4. Samples: 3915836. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:53:05,371][03180] Avg episode reward: [(0, '4151.282')] [2024-12-13 04:53:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3919872. Throughput: 0: 1091.0. Samples: 3919768. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:53:10,371][03180] Avg episode reward: [(0, '4212.346')] [2024-12-13 04:53:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007656_3919872.pth... [2024-12-13 04:53:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007592_3887104.pth [2024-12-13 04:53:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3923968. Throughput: 0: 1084.7. Samples: 3925836. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:53:15,371][03180] Avg episode reward: [(0, '4226.395')] [2024-12-13 04:53:19,584][03226] Updated weights for policy 0, policy_version 7680 (0.0012) [2024-12-13 04:53:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3932160. Throughput: 0: 1063.6. Samples: 3932732. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:53:20,371][03180] Avg episode reward: [(0, '4202.612')] [2024-12-13 04:53:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3936256. Throughput: 0: 1079.8. Samples: 3936624. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:53:25,372][03180] Avg episode reward: [(0, '4240.487')] [2024-12-13 04:53:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007688_3936256.pth... [2024-12-13 04:53:25,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007624_3903488.pth [2024-12-13 04:53:30,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 3940352. Throughput: 0: 1088.9. Samples: 3942924. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:53:30,374][03180] Avg episode reward: [(0, '4267.062')] [2024-12-13 04:53:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3948544. Throughput: 0: 1064.8. Samples: 3949664. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:53:35,371][03180] Avg episode reward: [(0, '4192.565')] [2024-12-13 04:53:40,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3952640. Throughput: 0: 1076.9. Samples: 3953556. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:53:40,372][03180] Avg episode reward: [(0, '4219.195')] [2024-12-13 04:53:40,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007720_3952640.pth... 
[2024-12-13 04:53:40,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007656_3919872.pth [2024-12-13 04:53:45,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 3960832. Throughput: 0: 1125.5. Samples: 3960136. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:53:45,375][03180] Avg episode reward: [(0, '4169.570')] [2024-12-13 04:53:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 3964928. Throughput: 0: 1121.8. Samples: 3966316. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:53:50,371][03180] Avg episode reward: [(0, '4163.582')] [2024-12-13 04:53:55,289][03226] Updated weights for policy 0, policy_version 7760 (0.0016) [2024-12-13 04:53:55,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3973120. Throughput: 0: 1120.4. Samples: 3970184. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:53:55,371][03180] Avg episode reward: [(0, '4122.624')] [2024-12-13 04:53:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007760_3973120.pth... [2024-12-13 04:53:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007688_3936256.pth [2024-12-13 04:54:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3977216. Throughput: 0: 1138.3. Samples: 3977060. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:54:00,372][03180] Avg episode reward: [(0, '4132.941')] [2024-12-13 04:54:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 3981312. Throughput: 0: 1122.3. Samples: 3983236. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:54:05,371][03180] Avg episode reward: [(0, '4107.601')] [2024-12-13 04:54:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 3989504. Throughput: 0: 1121.8. Samples: 3987104. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:54:10,371][03180] Avg episode reward: [(0, '4126.736')] [2024-12-13 04:54:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007792_3989504.pth... [2024-12-13 04:54:10,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007720_3952640.pth [2024-12-13 04:54:15,377][03180] Fps is (10 sec: 1228.0, 60 sec: 1160.4, 300 sec: 1124.6). Total num frames: 3993600. Throughput: 0: 1136.6. Samples: 3994076. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:54:15,378][03180] Avg episode reward: [(0, '4080.871')] [2024-12-13 04:54:20,374][03180] Fps is (10 sec: 818.9, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 3997696. Throughput: 0: 1122.3. Samples: 4000172. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:54:20,374][03180] Avg episode reward: [(0, '4013.493')] [2024-12-13 04:54:25,371][03180] Fps is (10 sec: 1229.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4005888. Throughput: 0: 1122.3. Samples: 4004060. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:54:25,371][03180] Avg episode reward: [(0, '4010.348')] [2024-12-13 04:54:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007824_4005888.pth... [2024-12-13 04:54:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007760_3973120.pth [2024-12-13 04:54:30,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 4009984. Throughput: 0: 1136.5. Samples: 4011276. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:54:30,372][03180] Avg episode reward: [(0, '4005.026')] [2024-12-13 04:54:32,344][03226] Updated weights for policy 0, policy_version 7840 (0.0009) [2024-12-13 04:54:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4014080. Throughput: 0: 1129.6. Samples: 4017148. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:54:35,371][03180] Avg episode reward: [(0, '4006.386')] [2024-12-13 04:54:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4022272. Throughput: 0: 1131.6. Samples: 4021108. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:54:40,371][03180] Avg episode reward: [(0, '4020.030')] [2024-12-13 04:54:40,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007856_4022272.pth... [2024-12-13 04:54:40,380][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007792_3989504.pth [2024-12-13 04:54:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4026368. Throughput: 0: 1146.7. Samples: 4028660. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:54:45,371][03180] Avg episode reward: [(0, '3977.290')] [2024-12-13 04:54:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4034560. Throughput: 0: 1133.2. Samples: 4034232. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:54:50,371][03180] Avg episode reward: [(0, '4026.236')] [2024-12-13 04:54:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4038656. Throughput: 0: 1134.4. Samples: 4038152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:54:55,372][03180] Avg episode reward: [(0, '4069.187')] [2024-12-13 04:54:55,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007888_4038656.pth... [2024-12-13 04:54:55,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007824_4005888.pth [2024-12-13 04:55:00,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1138.6). Total num frames: 4046848. Throughput: 0: 1153.8. Samples: 4045992. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:55:00,374][03180] Avg episode reward: [(0, '4057.440')] [2024-12-13 04:55:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4050944. Throughput: 0: 1132.5. Samples: 4051132. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:55:05,371][03180] Avg episode reward: [(0, '4129.189')] [2024-12-13 04:55:08,050][03226] Updated weights for policy 0, policy_version 7920 (0.0009) [2024-12-13 04:55:10,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4055040. Throughput: 0: 1132.9. Samples: 4055040. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:55:10,371][03180] Avg episode reward: [(0, '4187.838')] [2024-12-13 04:55:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007920_4055040.pth... [2024-12-13 04:55:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007856_4022272.pth [2024-12-13 04:55:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.7, 300 sec: 1124.7). Total num frames: 4063232. Throughput: 0: 1146.1. Samples: 4062852. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:55:15,371][03180] Avg episode reward: [(0, '4199.347')] [2024-12-13 04:55:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 4067328. Throughput: 0: 1133.8. Samples: 4068168. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:55:20,371][03180] Avg episode reward: [(0, '4184.105')] [2024-12-13 04:55:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4071424. Throughput: 0: 1128.3. Samples: 4071880. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:55:25,371][03180] Avg episode reward: [(0, '4224.950')] [2024-12-13 04:55:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007952_4071424.pth... 
[2024-12-13 04:55:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007888_4038656.pth [2024-12-13 04:55:30,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4079616. Throughput: 0: 1134.1. Samples: 4079696. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:55:30,373][03180] Avg episode reward: [(0, '4187.309')] [2024-12-13 04:55:35,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4083712. Throughput: 0: 1134.5. Samples: 4085288. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:55:35,373][03180] Avg episode reward: [(0, '4219.660')] [2024-12-13 04:55:40,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4091904. Throughput: 0: 1127.6. Samples: 4088896. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:55:40,371][03180] Avg episode reward: [(0, '4245.986')] [2024-12-13 04:55:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000007992_4091904.pth... [2024-12-13 04:55:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007920_4055040.pth [2024-12-13 04:55:43,557][03226] Updated weights for policy 0, policy_version 8000 (0.0009) [2024-12-13 04:55:45,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4096000. Throughput: 0: 1124.5. Samples: 4096592. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:55:45,371][03180] Avg episode reward: [(0, '4228.368')] [2024-12-13 04:55:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4100096. Throughput: 0: 1141.2. Samples: 4102488. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:55:50,371][03180] Avg episode reward: [(0, '4163.163')] [2024-12-13 04:55:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4108288. Throughput: 0: 1125.6. Samples: 4105692. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:55:55,372][03180] Avg episode reward: [(0, '4135.052')] [2024-12-13 04:55:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008024_4108288.pth... [2024-12-13 04:55:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007952_4071424.pth [2024-12-13 04:56:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4112384. Throughput: 0: 1121.9. Samples: 4113336. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:56:00,371][03180] Avg episode reward: [(0, '4169.496')] [2024-12-13 04:56:05,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 4116480. Throughput: 0: 1142.8. Samples: 4119596. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:56:05,375][03180] Avg episode reward: [(0, '4190.594')] [2024-12-13 04:56:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4124672. Throughput: 0: 1123.3. Samples: 4122428. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:56:10,371][03180] Avg episode reward: [(0, '4200.229')] [2024-12-13 04:56:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008056_4124672.pth... [2024-12-13 04:56:10,380][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000007992_4091904.pth [2024-12-13 04:56:15,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4128768. Throughput: 0: 1119.2. Samples: 4130056. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:56:15,371][03180] Avg episode reward: [(0, '4179.511')] [2024-12-13 04:56:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4132864. Throughput: 0: 1138.4. Samples: 4136516. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:56:20,371][03180] Avg episode reward: [(0, '4141.053')] [2024-12-13 04:56:20,860][03226] Updated weights for policy 0, policy_version 8080 (0.0010) [2024-12-13 04:56:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4141056. Throughput: 0: 1113.9. Samples: 4139020. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 04:56:25,371][03180] Avg episode reward: [(0, '4124.882')] [2024-12-13 04:56:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008088_4141056.pth... [2024-12-13 04:56:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008024_4108288.pth [2024-12-13 04:56:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4145152. Throughput: 0: 1113.3. Samples: 4146692. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:56:30,371][03180] Avg episode reward: [(0, '4035.669')] [2024-12-13 04:56:35,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4153344. Throughput: 0: 1128.3. Samples: 4153268. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:56:35,376][03180] Avg episode reward: [(0, '3971.795')] [2024-12-13 04:56:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4157440. Throughput: 0: 1111.6. Samples: 4155712. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 04:56:40,371][03180] Avg episode reward: [(0, '3833.152')] [2024-12-13 04:56:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008120_4157440.pth... [2024-12-13 04:56:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008056_4124672.pth [2024-12-13 04:56:45,371][03180] Fps is (10 sec: 819.6, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4161536. Throughput: 0: 1109.2. Samples: 4163248. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:56:45,371][03180] Avg episode reward: [(0, '3847.529')] [2024-12-13 04:56:50,374][03180] Fps is (10 sec: 818.9, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 4165632. Throughput: 0: 1084.6. Samples: 4168404. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:56:50,375][03180] Avg episode reward: [(0, '3852.219')] [2024-12-13 04:56:55,373][03180] Fps is (10 sec: 819.0, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 4169728. Throughput: 0: 1067.1. Samples: 4170452. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:56:55,374][03180] Avg episode reward: [(0, '3883.147')] [2024-12-13 04:56:55,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008144_4169728.pth... [2024-12-13 04:56:55,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008088_4141056.pth [2024-12-13 04:56:59,654][03226] Updated weights for policy 0, policy_version 8160 (0.0013) [2024-12-13 04:57:00,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4177920. Throughput: 0: 1047.9. Samples: 4177212. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:57:00,371][03180] Avg episode reward: [(0, '3858.546')] [2024-12-13 04:57:05,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4182016. Throughput: 0: 1074.0. Samples: 4184848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:57:05,371][03180] Avg episode reward: [(0, '3832.304')] [2024-12-13 04:57:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 4186112. Throughput: 0: 1075.6. Samples: 4187424. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:57:10,371][03180] Avg episode reward: [(0, '3766.913')] [2024-12-13 04:57:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008176_4186112.pth... 
[2024-12-13 04:57:10,396][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008120_4157440.pth [2024-12-13 04:57:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4194304. Throughput: 0: 1047.0. Samples: 4193808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:57:15,372][03180] Avg episode reward: [(0, '3863.624')] [2024-12-13 04:57:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4198400. Throughput: 0: 1073.4. Samples: 4201568. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:57:20,372][03180] Avg episode reward: [(0, '3818.890')] [2024-12-13 04:57:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 4202496. Throughput: 0: 1085.6. Samples: 4204564. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:57:25,371][03180] Avg episode reward: [(0, '3783.879')] [2024-12-13 04:57:25,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008208_4202496.pth... [2024-12-13 04:57:25,394][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008144_4169728.pth [2024-12-13 04:57:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4210688. Throughput: 0: 1052.4. Samples: 4210604. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:57:30,375][03180] Avg episode reward: [(0, '3788.017')] [2024-12-13 04:57:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1024.1, 300 sec: 1110.8). Total num frames: 4214784. Throughput: 0: 1110.0. Samples: 4218352. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:57:35,371][03180] Avg episode reward: [(0, '3852.614')] [2024-12-13 04:57:35,550][03226] Updated weights for policy 0, policy_version 8240 (0.0008) [2024-12-13 04:57:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4222976. Throughput: 0: 1137.4. Samples: 4221632. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:57:40,371][03180] Avg episode reward: [(0, '3875.694')] [2024-12-13 04:57:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008248_4222976.pth... [2024-12-13 04:57:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008176_4186112.pth [2024-12-13 04:57:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4227072. Throughput: 0: 1111.2. Samples: 4227216. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:57:45,371][03180] Avg episode reward: [(0, '3959.419')] [2024-12-13 04:57:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 4235264. Throughput: 0: 1117.5. Samples: 4235136. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:57:50,372][03180] Avg episode reward: [(0, '3958.517')] [2024-12-13 04:57:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 4239360. Throughput: 0: 1143.6. Samples: 4238888. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:57:55,371][03180] Avg episode reward: [(0, '3973.588')] [2024-12-13 04:57:55,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008280_4239360.pth... [2024-12-13 04:57:55,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008208_4202496.pth [2024-12-13 04:58:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4243456. Throughput: 0: 1121.2. Samples: 4244264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:58:00,371][03180] Avg episode reward: [(0, '4018.853')] [2024-12-13 04:58:05,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4251648. Throughput: 0: 1121.3. Samples: 4252028. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:58:05,373][03180] Avg episode reward: [(0, '4143.969')] [2024-12-13 04:58:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4255744. Throughput: 0: 1142.9. Samples: 4255996. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:58:10,371][03180] Avg episode reward: [(0, '4238.452')] [2024-12-13 04:58:10,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008312_4255744.pth... [2024-12-13 04:58:10,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008248_4222976.pth [2024-12-13 04:58:12,597][03226] Updated weights for policy 0, policy_version 8320 (0.0009) [2024-12-13 04:58:15,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4259840. Throughput: 0: 1123.5. Samples: 4261160. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:58:15,371][03180] Avg episode reward: [(0, '4258.881')] [2024-12-13 04:58:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4268032. Throughput: 0: 1126.6. Samples: 4269048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:58:20,371][03180] Avg episode reward: [(0, '4261.410')] [2024-12-13 04:58:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4272128. Throughput: 0: 1139.0. Samples: 4272888. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:58:25,372][03180] Avg episode reward: [(0, '4409.650')] [2024-12-13 04:58:25,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008344_4272128.pth... [2024-12-13 04:58:25,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008280_4239360.pth [2024-12-13 04:58:25,388][03213] Saving new best policy, reward=4409.650! [2024-12-13 04:58:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4276224. Throughput: 0: 1136.4. Samples: 4278356. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 04:58:30,371][03180] Avg episode reward: [(0, '4454.596')] [2024-12-13 04:58:30,372][03213] Saving new best policy, reward=4454.596! [2024-12-13 04:58:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4284416. Throughput: 0: 1133.4. Samples: 4286140. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:58:35,371][03180] Avg episode reward: [(0, '4421.512')] [2024-12-13 04:58:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4288512. Throughput: 0: 1135.4. Samples: 4289980. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:58:40,371][03180] Avg episode reward: [(0, '4411.730')] [2024-12-13 04:58:40,460][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008384_4292608.pth... [2024-12-13 04:58:40,474][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008312_4255744.pth [2024-12-13 04:58:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4296704. Throughput: 0: 1144.8. Samples: 4295780. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:58:45,371][03180] Avg episode reward: [(0, '4391.711')] [2024-12-13 04:58:48,326][03226] Updated weights for policy 0, policy_version 8400 (0.0009) [2024-12-13 04:58:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4300800. Throughput: 0: 1134.4. Samples: 4303076. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:58:50,371][03180] Avg episode reward: [(0, '4364.786')] [2024-12-13 04:58:55,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1160.4, 300 sec: 1124.6). Total num frames: 4308992. Throughput: 0: 1134.2. Samples: 4307040. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:58:55,376][03180] Avg episode reward: [(0, '4366.896')] [2024-12-13 04:58:55,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008416_4308992.pth... 
[2024-12-13 04:58:55,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008344_4272128.pth [2024-12-13 04:59:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4313088. Throughput: 0: 1152.9. Samples: 4313040. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:59:00,372][03180] Avg episode reward: [(0, '4344.321')] [2024-12-13 04:59:05,371][03180] Fps is (10 sec: 819.6, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4317184. Throughput: 0: 1136.0. Samples: 4320168. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:59:05,371][03180] Avg episode reward: [(0, '4264.162')] [2024-12-13 04:59:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4325376. Throughput: 0: 1137.4. Samples: 4324072. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:59:10,373][03180] Avg episode reward: [(0, '4225.597')] [2024-12-13 04:59:10,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008448_4325376.pth... [2024-12-13 04:59:10,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008384_4292608.pth [2024-12-13 04:59:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4329472. Throughput: 0: 1153.4. Samples: 4330260. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 04:59:15,371][03180] Avg episode reward: [(0, '4245.307')] [2024-12-13 04:59:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4333568. Throughput: 0: 1133.0. Samples: 4337124. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:59:20,371][03180] Avg episode reward: [(0, '4243.212')] [2024-12-13 04:59:23,652][03226] Updated weights for policy 0, policy_version 8480 (0.0009) [2024-12-13 04:59:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4341760. Throughput: 0: 1134.0. Samples: 4341008. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:59:25,371][03180] Avg episode reward: [(0, '4184.225')] [2024-12-13 04:59:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008480_4341760.pth... [2024-12-13 04:59:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008416_4308992.pth [2024-12-13 04:59:30,374][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4345856. Throughput: 0: 1152.6. Samples: 4347648. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 04:59:30,374][03180] Avg episode reward: [(0, '4170.184')] [2024-12-13 04:59:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4354048. Throughput: 0: 1135.8. Samples: 4354188. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:59:35,371][03180] Avg episode reward: [(0, '4141.814')] [2024-12-13 04:59:40,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4358144. Throughput: 0: 1135.9. Samples: 4358152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:59:40,374][03180] Avg episode reward: [(0, '4144.105')] [2024-12-13 04:59:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008512_4358144.pth... [2024-12-13 04:59:40,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008448_4325376.pth [2024-12-13 04:59:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4362240. Throughput: 0: 1160.7. Samples: 4365272. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 04:59:45,371][03180] Avg episode reward: [(0, '4112.033')] [2024-12-13 04:59:50,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4370432. Throughput: 0: 1135.9. Samples: 4371284. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 04:59:50,371][03180] Avg episode reward: [(0, '4063.740')] [2024-12-13 04:59:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4374528. Throughput: 0: 1135.6. Samples: 4375176. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 04:59:55,371][03180] Avg episode reward: [(0, '4110.657')] [2024-12-13 04:59:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008544_4374528.pth... [2024-12-13 04:59:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008480_4341760.pth [2024-12-13 04:59:59,290][03226] Updated weights for policy 0, policy_version 8560 (0.0011) [2024-12-13 05:00:00,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4382720. Throughput: 0: 1163.1. Samples: 4382600. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:00:00,373][03180] Avg episode reward: [(0, '4222.969')] [2024-12-13 05:00:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4386816. Throughput: 0: 1132.8. Samples: 4388100. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:00:05,371][03180] Avg episode reward: [(0, '4258.312')] [2024-12-13 05:00:10,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4390912. Throughput: 0: 1131.5. Samples: 4391924. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:00:10,371][03180] Avg episode reward: [(0, '4333.974')] [2024-12-13 05:00:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008576_4390912.pth... [2024-12-13 05:00:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008512_4358144.pth [2024-12-13 05:00:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4399104. Throughput: 0: 1147.4. Samples: 4399280. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:00:15,372][03180] Avg episode reward: [(0, '4288.035')] [2024-12-13 05:00:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4403200. Throughput: 0: 1123.8. Samples: 4404760. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:00:20,371][03180] Avg episode reward: [(0, '4267.284')] [2024-12-13 05:00:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4411392. Throughput: 0: 1121.8. Samples: 4408628. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:00:25,372][03180] Avg episode reward: [(0, '4312.906')] [2024-12-13 05:00:25,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008616_4411392.pth... [2024-12-13 05:00:25,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008544_4374528.pth [2024-12-13 05:00:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4415488. Throughput: 0: 1138.4. Samples: 4416500. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:00:30,379][03180] Avg episode reward: [(0, '4303.014')] [2024-12-13 05:00:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4419584. Throughput: 0: 1123.5. Samples: 4421840. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:00:35,371][03180] Avg episode reward: [(0, '4299.195')] [2024-12-13 05:00:36,483][03226] Updated weights for policy 0, policy_version 8640 (0.0009) [2024-12-13 05:00:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4423680. Throughput: 0: 1120.5. Samples: 4425600. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:00:40,371][03180] Avg episode reward: [(0, '4321.988')] [2024-12-13 05:00:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008640_4423680.pth... 
[2024-12-13 05:00:40,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008576_4390912.pth [2024-12-13 05:00:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4431872. Throughput: 0: 1070.0. Samples: 4430748. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:00:45,371][03180] Avg episode reward: [(0, '4287.685')] [2024-12-13 05:00:50,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 4435968. Throughput: 0: 1071.7. Samples: 4436332. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:00:50,376][03180] Avg episode reward: [(0, '4344.140')] [2024-12-13 05:00:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4440064. Throughput: 0: 1067.7. Samples: 4439972. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:00:55,371][03180] Avg episode reward: [(0, '4461.362')] [2024-12-13 05:00:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008672_4440064.pth... [2024-12-13 05:00:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008616_4411392.pth [2024-12-13 05:00:55,383][03213] Saving new best policy, reward=4461.362! [2024-12-13 05:01:00,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4448256. Throughput: 0: 1079.6. Samples: 4447860. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:01:00,373][03180] Avg episode reward: [(0, '4549.353')] [2024-12-13 05:01:00,374][03213] Saving new best policy, reward=4549.353! [2024-12-13 05:01:05,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 4452352. Throughput: 0: 1087.4. Samples: 4453700. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:01:05,376][03180] Avg episode reward: [(0, '4566.553')] [2024-12-13 05:01:05,378][03213] Saving new best policy, reward=4566.553! 
[2024-12-13 05:01:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4456448. Throughput: 0: 1075.0. Samples: 4457004. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:01:10,371][03180] Avg episode reward: [(0, '4580.902')] [2024-12-13 05:01:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008704_4456448.pth... [2024-12-13 05:01:10,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008640_4423680.pth [2024-12-13 05:01:10,387][03213] Saving new best policy, reward=4580.902! [2024-12-13 05:01:13,846][03226] Updated weights for policy 0, policy_version 8720 (0.0009) [2024-12-13 05:01:15,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4464640. Throughput: 0: 1075.3. Samples: 4464888. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:01:15,371][03180] Avg episode reward: [(0, '4628.305')] [2024-12-13 05:01:15,372][03213] Saving new best policy, reward=4628.305! [2024-12-13 05:01:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4468736. Throughput: 0: 1095.4. Samples: 4471132. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:01:20,374][03180] Avg episode reward: [(0, '4628.489')] [2024-12-13 05:01:20,375][03213] Saving new best policy, reward=4628.489! [2024-12-13 05:01:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4476928. Throughput: 0: 1080.1. Samples: 4474204. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:01:25,371][03180] Avg episode reward: [(0, '4639.173')] [2024-12-13 05:01:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008744_4476928.pth... [2024-12-13 05:01:25,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008672_4440064.pth [2024-12-13 05:01:25,382][03213] Saving new best policy, reward=4639.173! 
[2024-12-13 05:01:30,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 4481024. Throughput: 0: 1142.3. Samples: 4482152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:01:30,373][03180] Avg episode reward: [(0, '4648.370')] [2024-12-13 05:01:30,374][03213] Saving new best policy, reward=4648.370! [2024-12-13 05:01:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4485120. Throughput: 0: 1162.2. Samples: 4488624. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:01:35,372][03180] Avg episode reward: [(0, '4635.671')] [2024-12-13 05:01:40,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4493312. Throughput: 0: 1143.5. Samples: 4491428. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:01:40,371][03180] Avg episode reward: [(0, '4639.451')] [2024-12-13 05:01:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008776_4493312.pth... [2024-12-13 05:01:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008704_4456448.pth [2024-12-13 05:01:45,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 4497408. Throughput: 0: 1141.6. Samples: 4499232. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:01:45,373][03180] Avg episode reward: [(0, '4672.415')] [2024-12-13 05:01:45,374][03213] Saving new best policy, reward=4672.415! [2024-12-13 05:01:49,760][03226] Updated weights for policy 0, policy_version 8800 (0.0019) [2024-12-13 05:01:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1138.6). Total num frames: 4505600. Throughput: 0: 1159.1. Samples: 4505852. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:01:50,371][03180] Avg episode reward: [(0, '4709.771')] [2024-12-13 05:01:50,372][03213] Saving new best policy, reward=4709.771! [2024-12-13 05:01:55,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). 
Total num frames: 4509696. Throughput: 0: 1142.8. Samples: 4508432. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:01:55,371][03180] Avg episode reward: [(0, '4769.377')] [2024-12-13 05:01:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008808_4509696.pth... [2024-12-13 05:01:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008744_4476928.pth [2024-12-13 05:01:55,386][03213] Saving new best policy, reward=4769.377! [2024-12-13 05:02:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 4517888. Throughput: 0: 1140.3. Samples: 4516200. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:02:00,371][03180] Avg episode reward: [(0, '4801.130')] [2024-12-13 05:02:00,372][03213] Saving new best policy, reward=4801.130! [2024-12-13 05:02:05,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.6, 300 sec: 1138.5). Total num frames: 4521984. Throughput: 0: 1152.4. Samples: 4522992. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:02:05,373][03180] Avg episode reward: [(0, '4810.744')] [2024-12-13 05:02:05,374][03213] Saving new best policy, reward=4810.744! [2024-12-13 05:02:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4526080. Throughput: 0: 1142.3. Samples: 4525608. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:02:10,371][03180] Avg episode reward: [(0, '4702.069')] [2024-12-13 05:02:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008840_4526080.pth... [2024-12-13 05:02:10,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008776_4493312.pth [2024-12-13 05:02:15,378][03180] Fps is (10 sec: 1228.0, 60 sec: 1160.4, 300 sec: 1138.5). Total num frames: 4534272. Throughput: 0: 1129.1. Samples: 4532968. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:02:15,379][03180] Avg episode reward: [(0, '4637.572')] [2024-12-13 05:02:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 4538368. Throughput: 0: 1140.9. Samples: 4539964. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:02:20,371][03180] Avg episode reward: [(0, '4614.042')] [2024-12-13 05:02:25,371][03180] Fps is (10 sec: 819.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4542464. Throughput: 0: 1134.1. Samples: 4542464. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:02:25,371][03180] Avg episode reward: [(0, '4497.350')] [2024-12-13 05:02:25,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008872_4542464.pth... [2024-12-13 05:02:25,395][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008808_4509696.pth [2024-12-13 05:02:26,796][03226] Updated weights for policy 0, policy_version 8880 (0.0011) [2024-12-13 05:02:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1138.5). Total num frames: 4550656. Throughput: 0: 1116.3. Samples: 4549464. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:02:30,371][03180] Avg episode reward: [(0, '4448.082')] [2024-12-13 05:02:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4554752. Throughput: 0: 1128.9. Samples: 4556652. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:02:35,371][03180] Avg episode reward: [(0, '4446.858')] [2024-12-13 05:02:40,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 4558848. Throughput: 0: 1128.1. Samples: 4559200. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:02:40,373][03180] Avg episode reward: [(0, '4447.406')] [2024-12-13 05:02:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008904_4558848.pth... 
[2024-12-13 05:02:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008840_4526080.pth [2024-12-13 05:02:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 4567040. Throughput: 0: 1106.0. Samples: 4565968. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:02:45,371][03180] Avg episode reward: [(0, '4482.143')] [2024-12-13 05:02:50,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4571136. Throughput: 0: 1123.6. Samples: 4573552. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:02:50,371][03180] Avg episode reward: [(0, '4436.212')] [2024-12-13 05:02:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4575232. Throughput: 0: 1120.2. Samples: 4576016. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:02:55,371][03180] Avg episode reward: [(0, '4395.218')] [2024-12-13 05:02:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008936_4575232.pth... [2024-12-13 05:02:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008872_4542464.pth [2024-12-13 05:03:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4583424. Throughput: 0: 1102.6. Samples: 4582576. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:03:00,372][03180] Avg episode reward: [(0, '4383.332')] [2024-12-13 05:03:02,931][03226] Updated weights for policy 0, policy_version 8960 (0.0009) [2024-12-13 05:03:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4587520. Throughput: 0: 1118.6. Samples: 4590300. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:03:05,371][03180] Avg episode reward: [(0, '4305.248')] [2024-12-13 05:03:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4591616. Throughput: 0: 1125.3. Samples: 4593104. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:03:10,371][03180] Avg episode reward: [(0, '4180.454')] [2024-12-13 05:03:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000008968_4591616.pth... [2024-12-13 05:03:10,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008904_4558848.pth [2024-12-13 05:03:15,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.4, 300 sec: 1124.7). Total num frames: 4599808. Throughput: 0: 1107.3. Samples: 4599292. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:03:15,372][03180] Avg episode reward: [(0, '4142.147')] [2024-12-13 05:03:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4603904. Throughput: 0: 1123.0. Samples: 4607188. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:03:20,371][03180] Avg episode reward: [(0, '4033.343')] [2024-12-13 05:03:25,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 4612096. Throughput: 0: 1133.9. Samples: 4610228. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:03:25,374][03180] Avg episode reward: [(0, '3896.119')] [2024-12-13 05:03:25,385][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009008_4612096.pth... [2024-12-13 05:03:25,394][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008936_4575232.pth [2024-12-13 05:03:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4616192. Throughput: 0: 1116.1. Samples: 4616192. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:03:30,371][03180] Avg episode reward: [(0, '3948.562')] [2024-12-13 05:03:35,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4620288. Throughput: 0: 1121.5. Samples: 4624020. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:03:35,371][03180] Avg episode reward: [(0, '3888.190')] [2024-12-13 05:03:39,446][03226] Updated weights for policy 0, policy_version 9040 (0.0010) [2024-12-13 05:03:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 4628480. Throughput: 0: 1141.1. Samples: 4627364. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:03:40,371][03180] Avg episode reward: [(0, '3897.511')] [2024-12-13 05:03:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009040_4628480.pth... [2024-12-13 05:03:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000008968_4591616.pth [2024-12-13 05:03:45,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 4632576. Throughput: 0: 1116.7. Samples: 4632828. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:03:45,373][03180] Avg episode reward: [(0, '3926.001')] [2024-12-13 05:03:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4640768. Throughput: 0: 1116.8. Samples: 4640556. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:03:50,371][03180] Avg episode reward: [(0, '4016.636')] [2024-12-13 05:03:55,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4644864. Throughput: 0: 1140.2. Samples: 4644412. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:03:55,371][03180] Avg episode reward: [(0, '4042.255')] [2024-12-13 05:03:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009072_4644864.pth... [2024-12-13 05:03:55,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009008_4612096.pth [2024-12-13 05:04:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4648960. Throughput: 0: 1118.9. Samples: 4649640. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:04:00,371][03180] Avg episode reward: [(0, '4099.871')] [2024-12-13 05:04:05,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4657152. Throughput: 0: 1114.8. Samples: 4657356. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:04:05,373][03180] Avg episode reward: [(0, '4085.249')] [2024-12-13 05:04:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4661248. Throughput: 0: 1134.4. Samples: 4661272. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:04:10,371][03180] Avg episode reward: [(0, '4143.505')] [2024-12-13 05:04:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009104_4661248.pth... [2024-12-13 05:04:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009040_4628480.pth [2024-12-13 05:04:15,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4665344. Throughput: 0: 1119.1. Samples: 4666552. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:04:15,371][03180] Avg episode reward: [(0, '4170.197')] [2024-12-13 05:04:16,470][03226] Updated weights for policy 0, policy_version 9120 (0.0012) [2024-12-13 05:04:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4673536. Throughput: 0: 1110.9. Samples: 4674012. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:04:20,371][03180] Avg episode reward: [(0, '4211.707')] [2024-12-13 05:04:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4677632. Throughput: 0: 1124.1. Samples: 4677948. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:04:25,371][03180] Avg episode reward: [(0, '4280.556')] [2024-12-13 05:04:25,387][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009136_4677632.pth... 
[2024-12-13 05:04:25,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009072_4644864.pth [2024-12-13 05:04:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4681728. Throughput: 0: 1129.5. Samples: 4683652. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:04:30,371][03180] Avg episode reward: [(0, '4267.551')] [2024-12-13 05:04:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4685824. Throughput: 0: 1085.4. Samples: 4689400. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:04:35,371][03180] Avg episode reward: [(0, '4282.338')] [2024-12-13 05:04:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4694016. Throughput: 0: 1064.5. Samples: 4692316. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:04:40,371][03180] Avg episode reward: [(0, '4351.922')] [2024-12-13 05:04:40,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009168_4694016.pth... [2024-12-13 05:04:40,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009104_4661248.pth [2024-12-13 05:04:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4698112. Throughput: 0: 1077.2. Samples: 4698112. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:04:45,372][03180] Avg episode reward: [(0, '4354.816')] [2024-12-13 05:04:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 4702208. Throughput: 0: 1063.8. Samples: 4705224. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:04:50,371][03180] Avg episode reward: [(0, '4469.328')] [2024-12-13 05:04:54,077][03226] Updated weights for policy 0, policy_version 9200 (0.0014) [2024-12-13 05:04:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4710400. Throughput: 0: 1064.4. Samples: 4709168. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:04:55,371][03180] Avg episode reward: [(0, '4516.518')] [2024-12-13 05:04:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009200_4710400.pth... [2024-12-13 05:04:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009136_4677632.pth [2024-12-13 05:05:00,382][03180] Fps is (10 sec: 1227.5, 60 sec: 1092.1, 300 sec: 1110.7). Total num frames: 4714496. Throughput: 0: 1084.0. Samples: 4715344. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:05:00,384][03180] Avg episode reward: [(0, '4519.234')] [2024-12-13 05:05:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 4718592. Throughput: 0: 1072.4. Samples: 4722272. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:05:05,371][03180] Avg episode reward: [(0, '4580.602')] [2024-12-13 05:05:10,371][03180] Fps is (10 sec: 1230.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4726784. Throughput: 0: 1073.3. Samples: 4726248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:05:10,371][03180] Avg episode reward: [(0, '4584.437')] [2024-12-13 05:05:10,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009232_4726784.pth... [2024-12-13 05:05:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009168_4694016.pth [2024-12-13 05:05:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4730880. Throughput: 0: 1089.6. Samples: 4732684. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:05:15,371][03180] Avg episode reward: [(0, '4586.785')] [2024-12-13 05:05:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4739072. Throughput: 0: 1110.1. Samples: 4739356. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:05:20,371][03180] Avg episode reward: [(0, '4586.470')] [2024-12-13 05:05:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4743168. Throughput: 0: 1131.7. Samples: 4743244. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:05:25,371][03180] Avg episode reward: [(0, '4561.046')] [2024-12-13 05:05:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009264_4743168.pth... [2024-12-13 05:05:25,396][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009200_4710400.pth [2024-12-13 05:05:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4747264. Throughput: 0: 1157.0. Samples: 4750176. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:05:30,371][03180] Avg episode reward: [(0, '4515.236')] [2024-12-13 05:05:30,646][03226] Updated weights for policy 0, policy_version 9280 (0.0017) [2024-12-13 05:05:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4755456. Throughput: 0: 1134.8. Samples: 4756288. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:05:35,371][03180] Avg episode reward: [(0, '4465.484')] [2024-12-13 05:05:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4759552. Throughput: 0: 1131.6. Samples: 4760088. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:05:40,371][03180] Avg episode reward: [(0, '4419.144')] [2024-12-13 05:05:40,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009296_4759552.pth... [2024-12-13 05:05:40,380][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009232_4726784.pth [2024-12-13 05:05:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4767744. Throughput: 0: 1152.9. Samples: 4767212. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:05:45,372][03180] Avg episode reward: [(0, '4433.553')] [2024-12-13 05:05:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4771840. Throughput: 0: 1129.1. Samples: 4773080. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:05:50,371][03180] Avg episode reward: [(0, '4448.487')] [2024-12-13 05:05:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4780032. Throughput: 0: 1126.1. Samples: 4776924. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:05:55,374][03180] Avg episode reward: [(0, '4433.073')] [2024-12-13 05:05:55,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009336_4780032.pth... [2024-12-13 05:05:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009264_4743168.pth [2024-12-13 05:06:00,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.7, 300 sec: 1124.7). Total num frames: 4784128. Throughput: 0: 1146.2. Samples: 4784264. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:06:00,373][03180] Avg episode reward: [(0, '4511.290')] [2024-12-13 05:06:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4788224. Throughput: 0: 1127.7. Samples: 4790104. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:06:05,371][03180] Avg episode reward: [(0, '4510.019')] [2024-12-13 05:06:06,792][03226] Updated weights for policy 0, policy_version 9360 (0.0009) [2024-12-13 05:06:10,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4796416. Throughput: 0: 1127.8. Samples: 4793996. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:06:10,371][03180] Avg episode reward: [(0, '4531.216')] [2024-12-13 05:06:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009368_4796416.pth... 
[2024-12-13 05:06:10,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009296_4759552.pth [2024-12-13 05:06:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4800512. Throughput: 0: 1138.0. Samples: 4801388. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:06:15,371][03180] Avg episode reward: [(0, '4598.569')] [2024-12-13 05:06:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4804608. Throughput: 0: 1119.4. Samples: 4806660. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:06:20,371][03180] Avg episode reward: [(0, '4591.395')] [2024-12-13 05:06:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4812800. Throughput: 0: 1120.5. Samples: 4810512. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:06:25,371][03180] Avg episode reward: [(0, '4647.816')] [2024-12-13 05:06:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009400_4812800.pth... [2024-12-13 05:06:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009336_4780032.pth [2024-12-13 05:06:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4816896. Throughput: 0: 1137.0. Samples: 4818376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:06:30,372][03180] Avg episode reward: [(0, '4676.893')] [2024-12-13 05:06:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4820992. Throughput: 0: 1120.5. Samples: 4823504. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:06:35,371][03180] Avg episode reward: [(0, '4667.980')] [2024-12-13 05:06:40,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4829184. Throughput: 0: 1121.0. Samples: 4827372. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:06:40,374][03180] Avg episode reward: [(0, '4697.407')] [2024-12-13 05:06:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009432_4829184.pth... [2024-12-13 05:06:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009368_4796416.pth [2024-12-13 05:06:42,598][03226] Updated weights for policy 0, policy_version 9440 (0.0009) [2024-12-13 05:06:45,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 4833280. Throughput: 0: 1130.6. Samples: 4835140. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:06:45,374][03180] Avg episode reward: [(0, '4712.989')] [2024-12-13 05:06:50,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4837376. Throughput: 0: 1117.3. Samples: 4840384. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:06:50,371][03180] Avg episode reward: [(0, '4714.363')] [2024-12-13 05:06:55,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4845568. Throughput: 0: 1113.8. Samples: 4844116. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:06:55,371][03180] Avg episode reward: [(0, '4768.932')] [2024-12-13 05:06:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009464_4845568.pth... [2024-12-13 05:06:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009400_4812800.pth [2024-12-13 05:07:00,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4849664. Throughput: 0: 1117.8. Samples: 4851692. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:07:00,373][03180] Avg episode reward: [(0, '4796.403')] [2024-12-13 05:07:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4853760. Throughput: 0: 1131.2. Samples: 4857564. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:07:05,371][03180] Avg episode reward: [(0, '4815.283')] [2024-12-13 05:07:05,372][03213] Saving new best policy, reward=4815.283! [2024-12-13 05:07:10,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4861952. Throughput: 0: 1122.1. Samples: 4861008. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:07:10,371][03180] Avg episode reward: [(0, '4747.387')] [2024-12-13 05:07:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009496_4861952.pth... [2024-12-13 05:07:10,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009432_4829184.pth [2024-12-13 05:07:15,371][03180] Fps is (10 sec: 1638.4, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4870144. Throughput: 0: 1123.7. Samples: 4868944. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:07:15,371][03180] Avg episode reward: [(0, '4778.595')] [2024-12-13 05:07:19,419][03226] Updated weights for policy 0, policy_version 9520 (0.0017) [2024-12-13 05:07:20,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4874240. Throughput: 0: 1139.3. Samples: 4874776. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:07:20,374][03180] Avg episode reward: [(0, '4801.823')] [2024-12-13 05:07:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4878336. Throughput: 0: 1125.8. Samples: 4878032. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:07:25,371][03180] Avg episode reward: [(0, '4800.975')] [2024-12-13 05:07:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009528_4878336.pth... [2024-12-13 05:07:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009464_4845568.pth [2024-12-13 05:07:30,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4886528. Throughput: 0: 1124.5. Samples: 4885740. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:07:30,371][03180] Avg episode reward: [(0, '4803.676')] [2024-12-13 05:07:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4890624. Throughput: 0: 1145.4. Samples: 4891928. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:07:35,371][03180] Avg episode reward: [(0, '4865.539')] [2024-12-13 05:07:35,372][03213] Saving new best policy, reward=4865.539! [2024-12-13 05:07:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4894720. Throughput: 0: 1122.5. Samples: 4894628. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:07:40,371][03180] Avg episode reward: [(0, '4843.314')] [2024-12-13 05:07:40,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009560_4894720.pth... [2024-12-13 05:07:40,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009496_4861952.pth [2024-12-13 05:07:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 4902912. Throughput: 0: 1123.7. Samples: 4902256. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:07:45,372][03180] Avg episode reward: [(0, '4832.076')] [2024-12-13 05:07:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4907008. Throughput: 0: 1135.6. Samples: 4908664. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:07:50,371][03180] Avg episode reward: [(0, '4881.402')] [2024-12-13 05:07:50,372][03213] Saving new best policy, reward=4881.402! [2024-12-13 05:07:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4911104. Throughput: 0: 1114.7. Samples: 4911168. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:07:55,371][03180] Avg episode reward: [(0, '4814.496')] [2024-12-13 05:07:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009592_4911104.pth... 
[2024-12-13 05:07:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009528_4878336.pth [2024-12-13 05:07:56,257][03226] Updated weights for policy 0, policy_version 9600 (0.0009) [2024-12-13 05:08:00,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 4919296. Throughput: 0: 1110.1. Samples: 4918900. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:08:00,372][03180] Avg episode reward: [(0, '4811.194')] [2024-12-13 05:08:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 4923392. Throughput: 0: 1132.2. Samples: 4925720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:08:05,371][03180] Avg episode reward: [(0, '4825.426')] [2024-12-13 05:08:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4927488. Throughput: 0: 1115.1. Samples: 4928212. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:08:10,371][03180] Avg episode reward: [(0, '4806.419')] [2024-12-13 05:08:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009624_4927488.pth... [2024-12-13 05:08:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009560_4894720.pth [2024-12-13 05:08:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 4935680. Throughput: 0: 1108.6. Samples: 4935628. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:08:15,371][03180] Avg episode reward: [(0, '4841.101')] [2024-12-13 05:08:20,376][03180] Fps is (10 sec: 1228.1, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 4939776. Throughput: 0: 1132.2. Samples: 4942884. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:08:20,377][03180] Avg episode reward: [(0, '4818.139')] [2024-12-13 05:08:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4943872. Throughput: 0: 1128.4. Samples: 4945408. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:08:25,371][03180] Avg episode reward: [(0, '4801.846')] [2024-12-13 05:08:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009656_4943872.pth... [2024-12-13 05:08:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009592_4911104.pth [2024-12-13 05:08:30,371][03180] Fps is (10 sec: 819.7, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 4947968. Throughput: 0: 1070.1. Samples: 4950412. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:08:30,371][03180] Avg episode reward: [(0, '4778.741')] [2024-12-13 05:08:33,883][03226] Updated weights for policy 0, policy_version 9680 (0.0009) [2024-12-13 05:08:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4956160. Throughput: 0: 1088.4. Samples: 4957640. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:08:35,371][03180] Avg episode reward: [(0, '4789.845')] [2024-12-13 05:08:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4960256. Throughput: 0: 1089.3. Samples: 4960188. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:08:40,375][03180] Avg episode reward: [(0, '4725.294')] [2024-12-13 05:08:40,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009688_4960256.pth... [2024-12-13 05:08:40,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009624_4927488.pth [2024-12-13 05:08:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4968448. Throughput: 0: 1063.6. Samples: 4966760. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:08:45,371][03180] Avg episode reward: [(0, '4709.572')] [2024-12-13 05:08:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4972544. Throughput: 0: 1083.2. Samples: 4974464. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:08:50,371][03180] Avg episode reward: [(0, '4687.272')] [2024-12-13 05:08:55,375][03180] Fps is (10 sec: 818.8, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 4976640. Throughput: 0: 1090.9. Samples: 4977308. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:08:55,376][03180] Avg episode reward: [(0, '4705.400')] [2024-12-13 05:08:55,387][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009720_4976640.pth... [2024-12-13 05:08:55,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009656_4943872.pth [2024-12-13 05:09:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4984832. Throughput: 0: 1067.3. Samples: 4983656. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:09:00,371][03180] Avg episode reward: [(0, '4702.116')] [2024-12-13 05:09:05,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 4988928. Throughput: 0: 1077.5. Samples: 4991364. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:09:05,371][03180] Avg episode reward: [(0, '4722.786')] [2024-12-13 05:09:10,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 4993024. Throughput: 0: 1090.1. Samples: 4994464. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:09:10,374][03180] Avg episode reward: [(0, '4681.654')] [2024-12-13 05:09:10,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009752_4993024.pth... [2024-12-13 05:09:10,395][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009688_4960256.pth [2024-12-13 05:09:11,258][03226] Updated weights for policy 0, policy_version 9760 (0.0010) [2024-12-13 05:09:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5001216. Throughput: 0: 1111.6. Samples: 5000436. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:09:15,371][03180] Avg episode reward: [(0, '4605.642')] [2024-12-13 05:09:20,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.4, 300 sec: 1110.8). Total num frames: 5005312. Throughput: 0: 1123.4. Samples: 5008192. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:09:20,371][03180] Avg episode reward: [(0, '4524.972')] [2024-12-13 05:09:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5013504. Throughput: 0: 1144.4. Samples: 5011684. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:09:25,372][03180] Avg episode reward: [(0, '4510.960')] [2024-12-13 05:09:25,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009792_5013504.pth... [2024-12-13 05:09:25,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009720_4976640.pth [2024-12-13 05:09:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5017600. Throughput: 0: 1124.5. Samples: 5017364. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:09:30,371][03180] Avg episode reward: [(0, '4456.917')] [2024-12-13 05:09:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5021696. Throughput: 0: 1129.1. Samples: 5025272. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:09:35,371][03180] Avg episode reward: [(0, '4412.605')] [2024-12-13 05:09:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5029888. Throughput: 0: 1153.3. Samples: 5029200. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:09:40,371][03180] Avg episode reward: [(0, '4384.989')] [2024-12-13 05:09:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009824_5029888.pth... 
[2024-12-13 05:09:40,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009752_4993024.pth [2024-12-13 05:09:45,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5033984. Throughput: 0: 1127.6. Samples: 5034400. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:09:45,372][03180] Avg episode reward: [(0, '4376.396')] [2024-12-13 05:09:46,910][03226] Updated weights for policy 0, policy_version 9840 (0.0010) [2024-12-13 05:09:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5042176. Throughput: 0: 1129.2. Samples: 5042176. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:09:50,371][03180] Avg episode reward: [(0, '4440.495')] [2024-12-13 05:09:55,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 5046272. Throughput: 0: 1146.7. Samples: 5046064. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:09:55,371][03180] Avg episode reward: [(0, '4479.066')] [2024-12-13 05:09:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009856_5046272.pth... [2024-12-13 05:09:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009792_5013504.pth [2024-12-13 05:10:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5050368. Throughput: 0: 1133.4. Samples: 5051440. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:10:00,372][03180] Avg episode reward: [(0, '4462.674')] [2024-12-13 05:10:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5058560. Throughput: 0: 1127.3. Samples: 5058920. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:10:05,371][03180] Avg episode reward: [(0, '4425.286')] [2024-12-13 05:10:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 5062656. Throughput: 0: 1134.8. Samples: 5062752. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:10:10,371][03180] Avg episode reward: [(0, '4356.247')] [2024-12-13 05:10:10,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009888_5062656.pth... [2024-12-13 05:10:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009824_5029888.pth [2024-12-13 05:10:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5066752. Throughput: 0: 1136.2. Samples: 5068492. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:10:15,371][03180] Avg episode reward: [(0, '4363.404')] [2024-12-13 05:10:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5074944. Throughput: 0: 1121.4. Samples: 5075736. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:10:20,371][03180] Avg episode reward: [(0, '4341.696')] [2024-12-13 05:10:22,525][03226] Updated weights for policy 0, policy_version 9920 (0.0010) [2024-12-13 05:10:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5079040. Throughput: 0: 1122.0. Samples: 5079692. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:10:25,371][03180] Avg episode reward: [(0, '4343.513')] [2024-12-13 05:10:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009920_5079040.pth... [2024-12-13 05:10:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009856_5046272.pth [2024-12-13 05:10:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5083136. Throughput: 0: 1142.9. Samples: 5085828. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:10:30,371][03180] Avg episode reward: [(0, '4358.601')] [2024-12-13 05:10:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5091328. Throughput: 0: 1125.8. Samples: 5092836. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:10:35,371][03180] Avg episode reward: [(0, '4473.724')] [2024-12-13 05:10:40,371][03180] Fps is (10 sec: 1638.4, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5099520. Throughput: 0: 1128.1. Samples: 5096828. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:10:40,371][03180] Avg episode reward: [(0, '4618.301')] [2024-12-13 05:10:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009960_5099520.pth... [2024-12-13 05:10:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009888_5062656.pth [2024-12-13 05:10:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5103616. Throughput: 0: 1150.7. Samples: 5103220. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:10:45,371][03180] Avg episode reward: [(0, '4681.385')] [2024-12-13 05:10:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5107712. Throughput: 0: 1130.0. Samples: 5109772. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:10:50,371][03180] Avg episode reward: [(0, '4668.247')] [2024-12-13 05:10:55,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5115904. Throughput: 0: 1131.4. Samples: 5113668. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:10:55,374][03180] Avg episode reward: [(0, '4701.155')] [2024-12-13 05:10:55,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000009992_5115904.pth... [2024-12-13 05:10:55,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009920_5079040.pth [2024-12-13 05:10:58,653][03226] Updated weights for policy 0, policy_version 10000 (0.0009) [2024-12-13 05:11:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5120000. Throughput: 0: 1152.3. Samples: 5120344. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:11:00,374][03180] Avg episode reward: [(0, '4707.530')] [2024-12-13 05:11:05,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5124096. Throughput: 0: 1134.4. Samples: 5126784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:11:05,371][03180] Avg episode reward: [(0, '4773.619')] [2024-12-13 05:11:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5132288. Throughput: 0: 1131.7. Samples: 5130620. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:11:10,371][03180] Avg episode reward: [(0, '4840.404')] [2024-12-13 05:11:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010024_5132288.pth... [2024-12-13 05:11:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009960_5099520.pth [2024-12-13 05:11:15,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5136384. Throughput: 0: 1149.7. Samples: 5137564. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:11:15,372][03180] Avg episode reward: [(0, '4784.095')] [2024-12-13 05:11:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5140480. Throughput: 0: 1131.2. Samples: 5143740. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:11:20,371][03180] Avg episode reward: [(0, '4759.449')] [2024-12-13 05:11:25,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5148672. Throughput: 0: 1129.8. Samples: 5147668. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:11:25,371][03180] Avg episode reward: [(0, '4735.754')] [2024-12-13 05:11:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010056_5148672.pth... 
[2024-12-13 05:11:25,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000009992_5115904.pth [2024-12-13 05:11:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5152768. Throughput: 0: 1150.2. Samples: 5154980. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:11:30,372][03180] Avg episode reward: [(0, '4816.477')] [2024-12-13 05:11:35,223][03226] Updated weights for policy 0, policy_version 10080 (0.0009) [2024-12-13 05:11:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5160960. Throughput: 0: 1134.0. Samples: 5160804. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:11:35,371][03180] Avg episode reward: [(0, '4902.069')] [2024-12-13 05:11:35,372][03213] Saving new best policy, reward=4902.069! [2024-12-13 05:11:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5165056. Throughput: 0: 1135.4. Samples: 5164756. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:11:40,371][03180] Avg episode reward: [(0, '4933.392')] [2024-12-13 05:11:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010088_5165056.pth... [2024-12-13 05:11:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010024_5132288.pth [2024-12-13 05:11:40,384][03213] Saving new best policy, reward=4933.392! [2024-12-13 05:11:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 5173248. Throughput: 0: 1153.7. Samples: 5172260. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:11:45,371][03180] Avg episode reward: [(0, '4906.091')] [2024-12-13 05:11:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5177344. Throughput: 0: 1131.3. Samples: 5177692. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:11:50,371][03180] Avg episode reward: [(0, '4913.535')] [2024-12-13 05:11:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5181440. Throughput: 0: 1133.2. Samples: 5181612. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:11:55,371][03180] Avg episode reward: [(0, '4916.992')] [2024-12-13 05:11:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010120_5181440.pth... [2024-12-13 05:11:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010056_5148672.pth [2024-12-13 05:12:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 5189632. Throughput: 0: 1155.0. Samples: 5189540. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:12:00,374][03180] Avg episode reward: [(0, '4917.471')] [2024-12-13 05:12:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5193728. Throughput: 0: 1133.1. Samples: 5194728. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:12:05,371][03180] Avg episode reward: [(0, '4878.852')] [2024-12-13 05:12:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5197824. Throughput: 0: 1134.2. Samples: 5198708. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:12:10,371][03180] Avg episode reward: [(0, '4881.032')] [2024-12-13 05:12:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010152_5197824.pth... [2024-12-13 05:12:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010088_5165056.pth [2024-12-13 05:12:10,545][03226] Updated weights for policy 0, policy_version 10160 (0.0010) [2024-12-13 05:12:15,374][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5206016. Throughput: 0: 1144.0. Samples: 5206460. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:12:15,375][03180] Avg episode reward: [(0, '4879.743')] [2024-12-13 05:12:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5210112. Throughput: 0: 1136.7. Samples: 5211956. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:12:20,371][03180] Avg episode reward: [(0, '4882.648')] [2024-12-13 05:12:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5214208. Throughput: 0: 1098.9. Samples: 5214208. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:12:25,372][03180] Avg episode reward: [(0, '4898.909')] [2024-12-13 05:12:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010184_5214208.pth... [2024-12-13 05:12:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010120_5181440.pth [2024-12-13 05:12:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5222400. Throughput: 0: 1082.4. Samples: 5220968. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:12:30,372][03180] Avg episode reward: [(0, '4904.124')] [2024-12-13 05:12:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5226496. Throughput: 0: 1098.5. Samples: 5227124. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:12:35,371][03180] Avg episode reward: [(0, '4889.662')] [2024-12-13 05:12:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5230592. Throughput: 0: 1078.8. Samples: 5230160. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:12:40,371][03180] Avg episode reward: [(0, '4971.320')] [2024-12-13 05:12:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010216_5230592.pth... [2024-12-13 05:12:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010152_5197824.pth [2024-12-13 05:12:40,383][03213] Saving new best policy, reward=4971.320! 
[2024-12-13 05:12:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5238784. Throughput: 0: 1076.8. Samples: 5237996. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:12:45,371][03180] Avg episode reward: [(0, '4993.405')] [2024-12-13 05:12:45,372][03213] Saving new best policy, reward=4993.405! [2024-12-13 05:12:48,442][03226] Updated weights for policy 0, policy_version 10240 (0.0013) [2024-12-13 05:12:50,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 5242880. Throughput: 0: 1102.5. Samples: 5244344. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:12:50,375][03180] Avg episode reward: [(0, '4993.690')] [2024-12-13 05:12:50,376][03213] Saving new best policy, reward=4993.690! [2024-12-13 05:12:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5246976. Throughput: 0: 1072.4. Samples: 5246964. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:12:55,371][03180] Avg episode reward: [(0, '4910.034')] [2024-12-13 05:12:55,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010248_5246976.pth... [2024-12-13 05:12:55,380][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010184_5214208.pth [2024-12-13 05:13:00,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5255168. Throughput: 0: 1076.0. Samples: 5254880. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:13:00,373][03180] Avg episode reward: [(0, '4825.287')] [2024-12-13 05:13:05,372][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 5259264. Throughput: 0: 1102.7. Samples: 5261580. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:13:05,372][03180] Avg episode reward: [(0, '4834.367')] [2024-12-13 05:13:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5263360. Throughput: 0: 1109.5. Samples: 5264136. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:13:10,373][03180] Avg episode reward: [(0, '4810.407')] [2024-12-13 05:13:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010280_5263360.pth... [2024-12-13 05:13:10,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010216_5230592.pth [2024-12-13 05:13:15,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5271552. Throughput: 0: 1124.1. Samples: 5271552. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:13:15,371][03180] Avg episode reward: [(0, '4787.037')] [2024-12-13 05:13:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5275648. Throughput: 0: 1147.6. Samples: 5278768. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:13:20,371][03180] Avg episode reward: [(0, '4803.540')] [2024-12-13 05:13:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5279744. Throughput: 0: 1134.7. Samples: 5281220. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:13:25,372][03180] Avg episode reward: [(0, '4773.203')] [2024-12-13 05:13:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010312_5279744.pth... [2024-12-13 05:13:25,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010248_5246976.pth [2024-12-13 05:13:25,539][03226] Updated weights for policy 0, policy_version 10320 (0.0009) [2024-12-13 05:13:30,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 5287936. Throughput: 0: 1117.5. Samples: 5288288. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:13:30,374][03180] Avg episode reward: [(0, '4776.331')] [2024-12-13 05:13:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5292032. Throughput: 0: 1144.3. Samples: 5295832. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:13:35,371][03180] Avg episode reward: [(0, '4734.347')] [2024-12-13 05:13:40,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5300224. Throughput: 0: 1141.7. Samples: 5298340. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:13:40,371][03180] Avg episode reward: [(0, '4734.074')] [2024-12-13 05:13:40,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010352_5300224.pth... [2024-12-13 05:13:40,380][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010280_5263360.pth [2024-12-13 05:13:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5304320. Throughput: 0: 1118.9. Samples: 5305232. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:13:45,371][03180] Avg episode reward: [(0, '4674.569')] [2024-12-13 05:13:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1138.6). Total num frames: 5312512. Throughput: 0: 1141.4. Samples: 5312940. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:13:50,371][03180] Avg episode reward: [(0, '4635.647')] [2024-12-13 05:13:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5316608. Throughput: 0: 1144.0. Samples: 5315616. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:13:55,371][03180] Avg episode reward: [(0, '4640.171')] [2024-12-13 05:13:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010384_5316608.pth... [2024-12-13 05:13:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010312_5279744.pth [2024-12-13 05:14:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5320704. Throughput: 0: 1125.7. Samples: 5322208. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:14:00,372][03180] Avg episode reward: [(0, '4623.929')] [2024-12-13 05:14:01,095][03226] Updated weights for policy 0, policy_version 10400 (0.0009) [2024-12-13 05:14:05,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1138.6). Total num frames: 5328896. Throughput: 0: 1136.1. Samples: 5329896. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:14:05,373][03180] Avg episode reward: [(0, '4607.868')] [2024-12-13 05:14:10,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5332992. Throughput: 0: 1148.8. Samples: 5332920. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:14:10,373][03180] Avg episode reward: [(0, '4613.659')] [2024-12-13 05:14:10,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010416_5332992.pth... [2024-12-13 05:14:10,401][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010352_5300224.pth [2024-12-13 05:14:15,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5337088. Throughput: 0: 1127.5. Samples: 5339024. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:14:15,371][03180] Avg episode reward: [(0, '4626.588')] [2024-12-13 05:14:20,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5345280. Throughput: 0: 1128.4. Samples: 5346608. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:14:20,371][03180] Avg episode reward: [(0, '4595.846')] [2024-12-13 05:14:25,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5349376. Throughput: 0: 1144.1. Samples: 5349828. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:14:25,375][03180] Avg episode reward: [(0, '4721.164')] [2024-12-13 05:14:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010448_5349376.pth... 
[2024-12-13 05:14:25,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010384_5316608.pth [2024-12-13 05:14:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5353472. Throughput: 0: 1123.9. Samples: 5355808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:14:30,371][03180] Avg episode reward: [(0, '4764.084')] [2024-12-13 05:14:35,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5361664. Throughput: 0: 1127.2. Samples: 5363664. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:14:35,371][03180] Avg episode reward: [(0, '4816.651')] [2024-12-13 05:14:36,660][03226] Updated weights for policy 0, policy_version 10480 (0.0009) [2024-12-13 05:14:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5365760. Throughput: 0: 1145.0. Samples: 5367140. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:14:40,371][03180] Avg episode reward: [(0, '4752.207')] [2024-12-13 05:14:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010480_5365760.pth... [2024-12-13 05:14:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010416_5332992.pth [2024-12-13 05:14:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5373952. Throughput: 0: 1122.8. Samples: 5372736. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:14:45,371][03180] Avg episode reward: [(0, '4753.006')] [2024-12-13 05:14:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5378048. Throughput: 0: 1122.9. Samples: 5380424. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:14:50,371][03180] Avg episode reward: [(0, '4757.562')] [2024-12-13 05:14:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5382144. Throughput: 0: 1140.1. Samples: 5384224. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:14:55,371][03180] Avg episode reward: [(0, '4792.069')] [2024-12-13 05:14:55,391][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010520_5386240.pth... [2024-12-13 05:14:55,397][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010448_5349376.pth [2024-12-13 05:15:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5390336. Throughput: 0: 1123.4. Samples: 5389576. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:15:00,375][03180] Avg episode reward: [(0, '4819.561')] [2024-12-13 05:15:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5394432. Throughput: 0: 1123.6. Samples: 5397172. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:15:05,371][03180] Avg episode reward: [(0, '4852.314')] [2024-12-13 05:15:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1138.5). Total num frames: 5402624. Throughput: 0: 1137.7. Samples: 5401020. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:15:10,371][03180] Avg episode reward: [(0, '4852.314')] [2024-12-13 05:15:10,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010552_5402624.pth... [2024-12-13 05:15:10,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010480_5365760.pth [2024-12-13 05:15:14,503][03226] Updated weights for policy 0, policy_version 10560 (0.0010) [2024-12-13 05:15:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5406720. Throughput: 0: 1123.7. Samples: 5406376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:15:15,371][03180] Avg episode reward: [(0, '4873.406')] [2024-12-13 05:15:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5410816. Throughput: 0: 1111.6. Samples: 5413684. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:15:20,371][03180] Avg episode reward: [(0, '4919.046')] [2024-12-13 05:15:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1138.5). Total num frames: 5419008. Throughput: 0: 1121.3. Samples: 5417600. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:15:25,371][03180] Avg episode reward: [(0, '4937.735')] [2024-12-13 05:15:25,392][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010584_5419008.pth... [2024-12-13 05:15:25,397][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010520_5386240.pth [2024-12-13 05:15:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5423104. Throughput: 0: 1121.6. Samples: 5423208. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:15:30,371][03180] Avg episode reward: [(0, '5025.665')] [2024-12-13 05:15:30,372][03213] Saving new best policy, reward=5025.665! [2024-12-13 05:15:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5427200. Throughput: 0: 1115.6. Samples: 5430628. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:15:35,371][03180] Avg episode reward: [(0, '4976.060')] [2024-12-13 05:15:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5435392. Throughput: 0: 1117.8. Samples: 5434524. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:15:40,371][03180] Avg episode reward: [(0, '5040.625')] [2024-12-13 05:15:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010616_5435392.pth... [2024-12-13 05:15:40,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010552_5402624.pth [2024-12-13 05:15:40,387][03213] Saving new best policy, reward=5040.625! [2024-12-13 05:15:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5439488. Throughput: 0: 1129.2. Samples: 5440388. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:15:45,372][03180] Avg episode reward: [(0, '5041.661')] [2024-12-13 05:15:45,374][03213] Saving new best policy, reward=5041.661! [2024-12-13 05:15:50,329][03226] Updated weights for policy 0, policy_version 10640 (0.0009) [2024-12-13 05:15:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5447680. Throughput: 0: 1115.6. Samples: 5447376. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:15:50,371][03180] Avg episode reward: [(0, '5100.508')] [2024-12-13 05:15:50,372][03213] Saving new best policy, reward=5100.508! [2024-12-13 05:15:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5451776. Throughput: 0: 1118.2. Samples: 5451340. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:15:55,371][03180] Avg episode reward: [(0, '5101.617')] [2024-12-13 05:15:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010648_5451776.pth... [2024-12-13 05:15:55,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010584_5419008.pth [2024-12-13 05:15:55,392][03213] Saving new best policy, reward=5101.617! [2024-12-13 05:16:00,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 5455872. Throughput: 0: 1137.9. Samples: 5457584. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:16:00,383][03180] Avg episode reward: [(0, '5094.979')] [2024-12-13 05:16:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5464064. Throughput: 0: 1121.0. Samples: 5464128. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:16:05,371][03180] Avg episode reward: [(0, '5049.148')] [2024-12-13 05:16:10,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5468160. Throughput: 0: 1122.1. Samples: 5468096. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:16:10,375][03180] Avg episode reward: [(0, '5086.084')] [2024-12-13 05:16:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010680_5468160.pth... [2024-12-13 05:16:10,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010616_5435392.pth [2024-12-13 05:16:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5472256. Throughput: 0: 1143.9. Samples: 5474684. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:16:15,373][03180] Avg episode reward: [(0, '5118.445')] [2024-12-13 05:16:15,373][03213] Saving new best policy, reward=5118.445! [2024-12-13 05:16:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5476352. Throughput: 0: 1078.5. Samples: 5479160. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:16:20,371][03180] Avg episode reward: [(0, '5093.407')] [2024-12-13 05:16:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5484544. Throughput: 0: 1056.7. Samples: 5482076. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:16:25,371][03180] Avg episode reward: [(0, '5045.954')] [2024-12-13 05:16:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010712_5484544.pth... [2024-12-13 05:16:25,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010648_5451776.pth [2024-12-13 05:16:28,219][03226] Updated weights for policy 0, policy_version 10720 (0.0009) [2024-12-13 05:16:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5488640. Throughput: 0: 1091.5. Samples: 5489504. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:16:30,373][03180] Avg episode reward: [(0, '5043.598')] [2024-12-13 05:16:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5492736. Throughput: 0: 1057.2. Samples: 5494952. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:16:35,374][03180] Avg episode reward: [(0, '5063.792')] [2024-12-13 05:16:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5500928. Throughput: 0: 1052.5. Samples: 5498704. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:16:40,371][03180] Avg episode reward: [(0, '5077.869')] [2024-12-13 05:16:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010744_5500928.pth... [2024-12-13 05:16:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010680_5468160.pth [2024-12-13 05:16:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5505024. Throughput: 0: 1087.3. Samples: 5506512. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:16:45,371][03180] Avg episode reward: [(0, '5079.359')] [2024-12-13 05:16:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 5509120. Throughput: 0: 1055.0. Samples: 5511604. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:16:50,371][03180] Avg episode reward: [(0, '5028.347')] [2024-12-13 05:16:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5517312. Throughput: 0: 1052.2. Samples: 5515444. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:16:55,371][03180] Avg episode reward: [(0, '4987.197')] [2024-12-13 05:16:55,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010776_5517312.pth... [2024-12-13 05:16:55,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010712_5484544.pth [2024-12-13 05:17:00,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5521408. Throughput: 0: 1081.7. Samples: 5523364. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:17:00,374][03180] Avg episode reward: [(0, '5035.233')] [2024-12-13 05:17:05,371][03180] Fps is (10 sec: 819.1, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 5525504. Throughput: 0: 1104.2. Samples: 5528848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:17:05,372][03180] Avg episode reward: [(0, '4993.781')] [2024-12-13 05:17:05,854][03226] Updated weights for policy 0, policy_version 10800 (0.0012) [2024-12-13 05:17:10,372][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 5533696. Throughput: 0: 1120.2. Samples: 5532484. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:17:10,372][03180] Avg episode reward: [(0, '4934.180')] [2024-12-13 05:17:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010808_5533696.pth... [2024-12-13 05:17:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010744_5500928.pth [2024-12-13 05:17:15,372][03180] Fps is (10 sec: 1638.2, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5541888. Throughput: 0: 1128.6. Samples: 5540292. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:17:15,373][03180] Avg episode reward: [(0, '4899.337')] [2024-12-13 05:17:20,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5545984. Throughput: 0: 1134.0. Samples: 5545984. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:17:20,371][03180] Avg episode reward: [(0, '4899.000')] [2024-12-13 05:17:25,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5550080. Throughput: 0: 1127.7. Samples: 5549452. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:17:25,371][03180] Avg episode reward: [(0, '4874.429')] [2024-12-13 05:17:25,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010840_5550080.pth... 
[2024-12-13 05:17:25,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010776_5517312.pth [2024-12-13 05:17:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5558272. Throughput: 0: 1125.4. Samples: 5557156. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:17:30,371][03180] Avg episode reward: [(0, '4961.182')] [2024-12-13 05:17:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5562368. Throughput: 0: 1147.7. Samples: 5563252. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:17:35,371][03180] Avg episode reward: [(0, '4921.037')] [2024-12-13 05:17:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5566464. Throughput: 0: 1133.2. Samples: 5566440. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:17:40,371][03180] Avg episode reward: [(0, '4872.129')] [2024-12-13 05:17:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010872_5566464.pth... [2024-12-13 05:17:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010808_5533696.pth [2024-12-13 05:17:41,368][03226] Updated weights for policy 0, policy_version 10880 (0.0011) [2024-12-13 05:17:45,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5574656. Throughput: 0: 1126.2. Samples: 5574044. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:17:45,374][03180] Avg episode reward: [(0, '4887.064')] [2024-12-13 05:17:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5578752. Throughput: 0: 1141.9. Samples: 5580232. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:17:50,371][03180] Avg episode reward: [(0, '4806.020')] [2024-12-13 05:17:55,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5582848. Throughput: 0: 1120.2. Samples: 5582892. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:17:55,371][03180] Avg episode reward: [(0, '4827.755')] [2024-12-13 05:17:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010904_5582848.pth... [2024-12-13 05:17:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010840_5550080.pth [2024-12-13 05:18:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 5591040. Throughput: 0: 1116.0. Samples: 5590508. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:18:00,371][03180] Avg episode reward: [(0, '4782.205')] [2024-12-13 05:18:05,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5595136. Throughput: 0: 1135.0. Samples: 5597060. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:18:05,373][03180] Avg episode reward: [(0, '4793.952')] [2024-12-13 05:18:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5599232. Throughput: 0: 1113.5. Samples: 5599560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:18:10,371][03180] Avg episode reward: [(0, '4740.480')] [2024-12-13 05:18:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010936_5599232.pth... [2024-12-13 05:18:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010872_5566464.pth [2024-12-13 05:18:15,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5607424. Throughput: 0: 1110.8. Samples: 5607140. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:18:15,371][03180] Avg episode reward: [(0, '4686.630')] [2024-12-13 05:18:17,478][03226] Updated weights for policy 0, policy_version 10960 (0.0018) [2024-12-13 05:18:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5611520. Throughput: 0: 1128.7. Samples: 5614044. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:18:20,371][03180] Avg episode reward: [(0, '4703.701')] [2024-12-13 05:18:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5615616. Throughput: 0: 1113.7. Samples: 5616556. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:18:25,371][03180] Avg episode reward: [(0, '4690.852')] [2024-12-13 05:18:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000010968_5615616.pth... [2024-12-13 05:18:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010904_5582848.pth [2024-12-13 05:18:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5623808. Throughput: 0: 1101.0. Samples: 5623584. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:18:30,371][03180] Avg episode reward: [(0, '4690.851')] [2024-12-13 05:18:35,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 5627904. Throughput: 0: 1123.0. Samples: 5630768. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:18:35,373][03180] Avg episode reward: [(0, '4697.229')] [2024-12-13 05:18:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5632000. Throughput: 0: 1117.2. Samples: 5633164. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:18:40,371][03180] Avg episode reward: [(0, '4698.378')] [2024-12-13 05:18:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011000_5632000.pth... [2024-12-13 05:18:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010936_5599232.pth [2024-12-13 05:18:45,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5640192. Throughput: 0: 1099.3. Samples: 5639976. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:18:45,371][03180] Avg episode reward: [(0, '4653.436')] [2024-12-13 05:18:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5644288. Throughput: 0: 1119.7. Samples: 5647444. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:18:50,371][03180] Avg episode reward: [(0, '4611.724')] [2024-12-13 05:18:55,374][03180] Fps is (10 sec: 818.9, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 5648384. Throughput: 0: 1116.5. Samples: 5649808. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:18:55,375][03180] Avg episode reward: [(0, '4564.774')] [2024-12-13 05:18:55,391][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011032_5648384.pth... [2024-12-13 05:18:55,399][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000010968_5615616.pth [2024-12-13 05:18:55,954][03226] Updated weights for policy 0, policy_version 11040 (0.0010) [2024-12-13 05:19:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5656576. Throughput: 0: 1094.9. Samples: 5656412. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:19:00,371][03180] Avg episode reward: [(0, '4602.924')] [2024-12-13 05:19:05,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5660672. Throughput: 0: 1115.2. Samples: 5664228. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:19:05,371][03180] Avg episode reward: [(0, '4603.863')] [2024-12-13 05:19:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5664768. Throughput: 0: 1117.6. Samples: 5666848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:19:10,371][03180] Avg episode reward: [(0, '4605.738')] [2024-12-13 05:19:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011064_5664768.pth... 
[2024-12-13 05:19:10,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011000_5632000.pth [2024-12-13 05:19:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5672960. Throughput: 0: 1102.1. Samples: 5673180. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:19:15,372][03180] Avg episode reward: [(0, '4688.594')] [2024-12-13 05:19:20,371][03180] Fps is (10 sec: 1638.4, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5681152. Throughput: 0: 1118.6. Samples: 5681104. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:19:20,371][03180] Avg episode reward: [(0, '4639.163')] [2024-12-13 05:19:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5685248. Throughput: 0: 1132.6. Samples: 5684132. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:19:25,374][03180] Avg episode reward: [(0, '4637.553')] [2024-12-13 05:19:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011104_5685248.pth... [2024-12-13 05:19:25,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011032_5648384.pth [2024-12-13 05:19:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5689344. Throughput: 0: 1115.3. Samples: 5690164. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:19:30,371][03180] Avg episode reward: [(0, '4690.931')] [2024-12-13 05:19:31,579][03226] Updated weights for policy 0, policy_version 11120 (0.0010) [2024-12-13 05:19:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 5697536. Throughput: 0: 1121.3. Samples: 5697904. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:19:35,371][03180] Avg episode reward: [(0, '4751.002')] [2024-12-13 05:19:40,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 5701632. Throughput: 0: 1145.7. Samples: 5701364. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:19:40,375][03180] Avg episode reward: [(0, '4769.835')] [2024-12-13 05:19:40,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011136_5701632.pth... [2024-12-13 05:19:40,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011064_5664768.pth [2024-12-13 05:19:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5705728. Throughput: 0: 1129.3. Samples: 5707232. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:19:45,371][03180] Avg episode reward: [(0, '4753.318')] [2024-12-13 05:19:50,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5713920. Throughput: 0: 1126.7. Samples: 5714928. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:19:50,371][03180] Avg episode reward: [(0, '4700.359')] [2024-12-13 05:19:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 5718016. Throughput: 0: 1148.5. Samples: 5718532. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:19:55,371][03180] Avg episode reward: [(0, '4701.999')] [2024-12-13 05:19:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011168_5718016.pth... [2024-12-13 05:19:55,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011104_5685248.pth [2024-12-13 05:20:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5722112. Throughput: 0: 1130.5. Samples: 5724052. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:20:00,371][03180] Avg episode reward: [(0, '4690.257')] [2024-12-13 05:20:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 5730304. Throughput: 0: 1124.5. Samples: 5731708. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:20:05,371][03180] Avg episode reward: [(0, '4678.368')] [2024-12-13 05:20:07,174][03226] Updated weights for policy 0, policy_version 11200 (0.0009) [2024-12-13 05:20:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 5734400. Throughput: 0: 1141.2. Samples: 5735484. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:20:10,371][03180] Avg episode reward: [(0, '4728.275')] [2024-12-13 05:20:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011200_5734400.pth... [2024-12-13 05:20:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011136_5701632.pth [2024-12-13 05:20:15,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 5738496. Throughput: 0: 1108.2. Samples: 5740036. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:20:15,374][03180] Avg episode reward: [(0, '4725.676')] [2024-12-13 05:20:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 5742592. Throughput: 0: 1069.0. Samples: 5746008. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:20:20,371][03180] Avg episode reward: [(0, '4755.864')] [2024-12-13 05:20:25,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5750784. Throughput: 0: 1077.5. Samples: 5749848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:20:25,371][03180] Avg episode reward: [(0, '4751.527')] [2024-12-13 05:20:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011232_5750784.pth... [2024-12-13 05:20:25,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011168_5718016.pth [2024-12-13 05:20:30,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 5754880. Throughput: 0: 1083.1. Samples: 5755976. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:20:30,373][03180] Avg episode reward: [(0, '4751.224')] [2024-12-13 05:20:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5763072. Throughput: 0: 1064.9. Samples: 5762848. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:20:35,374][03180] Avg episode reward: [(0, '4727.954')] [2024-12-13 05:20:40,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5767168. Throughput: 0: 1072.2. Samples: 5766780. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:20:40,371][03180] Avg episode reward: [(0, '4742.456')] [2024-12-13 05:20:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011264_5767168.pth... [2024-12-13 05:20:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011200_5734400.pth [2024-12-13 05:20:45,371][03180] Fps is (10 sec: 819.1, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 5771264. Throughput: 0: 1096.4. Samples: 5773392. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:20:45,372][03180] Avg episode reward: [(0, '4788.503')] [2024-12-13 05:20:46,465][03226] Updated weights for policy 0, policy_version 11280 (0.0009) [2024-12-13 05:20:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5779456. Throughput: 0: 1070.8. Samples: 5779892. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:20:50,371][03180] Avg episode reward: [(0, '4796.312')] [2024-12-13 05:20:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5783552. Throughput: 0: 1072.6. Samples: 5783752. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:20:55,371][03180] Avg episode reward: [(0, '4778.542')] [2024-12-13 05:20:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011296_5783552.pth... 
[2024-12-13 05:20:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011232_5750784.pth [2024-12-13 05:21:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 5787648. Throughput: 0: 1125.3. Samples: 5790672. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:21:00,372][03180] Avg episode reward: [(0, '4790.396')] [2024-12-13 05:21:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5795840. Throughput: 0: 1127.7. Samples: 5796756. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:21:05,371][03180] Avg episode reward: [(0, '4858.242')] [2024-12-13 05:21:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5799936. Throughput: 0: 1126.1. Samples: 5800524. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:21:10,372][03180] Avg episode reward: [(0, '4931.200')] [2024-12-13 05:21:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011328_5799936.pth... [2024-12-13 05:21:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011264_5767168.pth [2024-12-13 05:21:15,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5808128. Throughput: 0: 1152.4. Samples: 5807836. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:21:15,374][03180] Avg episode reward: [(0, '4918.309')] [2024-12-13 05:21:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 5812224. Throughput: 0: 1126.4. Samples: 5813536. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:21:20,371][03180] Avg episode reward: [(0, '4933.467')] [2024-12-13 05:21:22,228][03226] Updated weights for policy 0, policy_version 11360 (0.0011) [2024-12-13 05:21:25,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5816320. Throughput: 0: 1124.2. Samples: 5817368. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:21:25,371][03180] Avg episode reward: [(0, '4992.260')] [2024-12-13 05:21:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011368_5820416.pth... [2024-12-13 05:21:25,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011296_5783552.pth [2024-12-13 05:21:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 5824512. Throughput: 0: 1145.0. Samples: 5824916. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:21:30,371][03180] Avg episode reward: [(0, '4904.678')] [2024-12-13 05:21:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5828608. Throughput: 0: 1122.6. Samples: 5830408. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:21:35,371][03180] Avg episode reward: [(0, '4861.022')] [2024-12-13 05:21:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5836800. Throughput: 0: 1122.0. Samples: 5834244. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:21:40,371][03180] Avg episode reward: [(0, '4757.252')] [2024-12-13 05:21:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011400_5836800.pth... [2024-12-13 05:21:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011328_5799936.pth [2024-12-13 05:21:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5840896. Throughput: 0: 1138.6. Samples: 5841908. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:21:45,388][03180] Avg episode reward: [(0, '4749.394')] [2024-12-13 05:21:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5844992. Throughput: 0: 1120.6. Samples: 5847184. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:21:50,371][03180] Avg episode reward: [(0, '4650.023')] [2024-12-13 05:21:55,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5853184. Throughput: 0: 1122.7. Samples: 5851048. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:21:55,373][03180] Avg episode reward: [(0, '4694.066')] [2024-12-13 05:21:55,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011432_5853184.pth... [2024-12-13 05:21:55,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011368_5820416.pth [2024-12-13 05:21:57,922][03226] Updated weights for policy 0, policy_version 11440 (0.0009) [2024-12-13 05:22:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5857280. Throughput: 0: 1130.8. Samples: 5858720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:22:00,371][03180] Avg episode reward: [(0, '4670.409')] [2024-12-13 05:22:05,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5861376. Throughput: 0: 1128.7. Samples: 5864328. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:22:05,372][03180] Avg episode reward: [(0, '4617.433')] [2024-12-13 05:22:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 5869568. Throughput: 0: 1125.3. Samples: 5868004. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:22:10,371][03180] Avg episode reward: [(0, '4551.388')] [2024-12-13 05:22:10,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011464_5869568.pth... [2024-12-13 05:22:10,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011400_5836800.pth [2024-12-13 05:22:15,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5873664. Throughput: 0: 1128.8. Samples: 5875716. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:22:15,373][03180] Avg episode reward: [(0, '4566.856')] [2024-12-13 05:22:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5877760. Throughput: 0: 1133.6. Samples: 5881420. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:22:20,371][03180] Avg episode reward: [(0, '4544.980')] [2024-12-13 05:22:25,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 5885952. Throughput: 0: 1123.0. Samples: 5884780. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:22:25,371][03180] Avg episode reward: [(0, '4469.546')] [2024-12-13 05:22:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011496_5885952.pth... [2024-12-13 05:22:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011432_5853184.pth [2024-12-13 05:22:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5890048. Throughput: 0: 1123.6. Samples: 5892472. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:22:30,371][03180] Avg episode reward: [(0, '4423.469')] [2024-12-13 05:22:35,092][03226] Updated weights for policy 0, policy_version 11520 (0.0009) [2024-12-13 05:22:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5898240. Throughput: 0: 1134.6. Samples: 5898240. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:22:35,371][03180] Avg episode reward: [(0, '4490.560')] [2024-12-13 05:22:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5902336. Throughput: 0: 1117.4. Samples: 5901328. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:22:40,372][03180] Avg episode reward: [(0, '4539.149')] [2024-12-13 05:22:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011528_5902336.pth... 
[2024-12-13 05:22:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011464_5869568.pth [2024-12-13 05:22:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5910528. Throughput: 0: 1117.7. Samples: 5909016. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:22:45,371][03180] Avg episode reward: [(0, '4564.288')] [2024-12-13 05:22:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5914624. Throughput: 0: 1131.0. Samples: 5915224. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:22:50,371][03180] Avg episode reward: [(0, '4517.575')] [2024-12-13 05:22:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5918720. Throughput: 0: 1110.9. Samples: 5917996. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:22:55,371][03180] Avg episode reward: [(0, '4486.356')] [2024-12-13 05:22:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011560_5918720.pth... [2024-12-13 05:22:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011496_5885952.pth [2024-12-13 05:23:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5926912. Throughput: 0: 1111.1. Samples: 5925712. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:23:00,371][03180] Avg episode reward: [(0, '4508.930')] [2024-12-13 05:23:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5931008. Throughput: 0: 1130.8. Samples: 5932308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:23:05,371][03180] Avg episode reward: [(0, '4619.031')] [2024-12-13 05:23:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5935104. Throughput: 0: 1112.7. Samples: 5934852. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:23:10,371][03180] Avg episode reward: [(0, '4647.595')] [2024-12-13 05:23:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011592_5935104.pth... [2024-12-13 05:23:10,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011528_5902336.pth [2024-12-13 05:23:11,562][03226] Updated weights for policy 0, policy_version 11600 (0.0008) [2024-12-13 05:23:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 5943296. Throughput: 0: 1110.4. Samples: 5942440. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:23:15,372][03180] Avg episode reward: [(0, '4677.732')] [2024-12-13 05:23:20,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5947392. Throughput: 0: 1131.4. Samples: 5949156. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:23:20,374][03180] Avg episode reward: [(0, '4727.319')] [2024-12-13 05:23:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5951488. Throughput: 0: 1114.7. Samples: 5951488. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:23:25,371][03180] Avg episode reward: [(0, '4737.339')] [2024-12-13 05:23:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011624_5951488.pth... [2024-12-13 05:23:25,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011560_5918720.pth [2024-12-13 05:23:30,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 5959680. Throughput: 0: 1102.8. Samples: 5958644. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:23:30,371][03180] Avg episode reward: [(0, '4799.475')] [2024-12-13 05:23:35,376][03180] Fps is (10 sec: 1228.1, 60 sec: 1092.2, 300 sec: 1124.6). Total num frames: 5963776. Throughput: 0: 1125.9. Samples: 5965896. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:23:35,377][03180] Avg episode reward: [(0, '4784.502')] [2024-12-13 05:23:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5967872. Throughput: 0: 1119.5. Samples: 5968372. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:23:40,371][03180] Avg episode reward: [(0, '4786.047')] [2024-12-13 05:23:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011656_5967872.pth... [2024-12-13 05:23:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011592_5935104.pth [2024-12-13 05:23:45,371][03180] Fps is (10 sec: 1229.5, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5976064. Throughput: 0: 1101.5. Samples: 5975280. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:23:45,371][03180] Avg episode reward: [(0, '4816.121')] [2024-12-13 05:23:48,062][03226] Updated weights for policy 0, policy_version 11680 (0.0009) [2024-12-13 05:23:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5980160. Throughput: 0: 1119.7. Samples: 5982696. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:23:50,375][03180] Avg episode reward: [(0, '4793.428')] [2024-12-13 05:23:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 5984256. Throughput: 0: 1112.8. Samples: 5984928. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:23:55,372][03180] Avg episode reward: [(0, '4754.121')] [2024-12-13 05:23:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011688_5984256.pth... [2024-12-13 05:23:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011624_5951488.pth [2024-12-13 05:24:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5992448. Throughput: 0: 1085.0. Samples: 5991264. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:24:00,371][03180] Avg episode reward: [(0, '4760.492')] [2024-12-13 05:24:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 5996544. Throughput: 0: 1102.7. Samples: 5998776. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:24:05,371][03180] Avg episode reward: [(0, '4759.089')] [2024-12-13 05:24:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6000640. Throughput: 0: 1113.4. Samples: 6001592. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:24:10,371][03180] Avg episode reward: [(0, '4755.905')] [2024-12-13 05:24:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011720_6000640.pth... [2024-12-13 05:24:10,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011656_5967872.pth [2024-12-13 05:24:15,373][03180] Fps is (10 sec: 819.0, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 6004736. Throughput: 0: 1048.8. Samples: 6005844. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:24:15,374][03180] Avg episode reward: [(0, '4802.668')] [2024-12-13 05:24:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6012928. Throughput: 0: 1042.4. Samples: 6012796. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:24:20,371][03180] Avg episode reward: [(0, '4764.202')] [2024-12-13 05:24:25,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6017024. Throughput: 0: 1071.6. Samples: 6016596. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:24:25,371][03180] Avg episode reward: [(0, '4899.536')] [2024-12-13 05:24:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011752_6017024.pth... 
[2024-12-13 05:24:25,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011688_5984256.pth [2024-12-13 05:24:28,744][03226] Updated weights for policy 0, policy_version 11760 (0.0011) [2024-12-13 05:24:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 6021120. Throughput: 0: 1025.9. Samples: 6021444. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:24:30,371][03180] Avg episode reward: [(0, '4912.159')] [2024-12-13 05:24:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1110.8). Total num frames: 6029312. Throughput: 0: 1030.7. Samples: 6029076. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:24:35,371][03180] Avg episode reward: [(0, '4893.033')] [2024-12-13 05:24:40,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 6033408. Throughput: 0: 1064.8. Samples: 6032848. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:24:40,376][03180] Avg episode reward: [(0, '4842.187')] [2024-12-13 05:24:40,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011784_6033408.pth... [2024-12-13 05:24:40,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011720_6000640.pth [2024-12-13 05:24:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 6037504. Throughput: 0: 1039.4. Samples: 6038036. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:24:45,371][03180] Avg episode reward: [(0, '4842.412')] [2024-12-13 05:24:50,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6045696. Throughput: 0: 1037.5. Samples: 6045464. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:24:50,371][03180] Avg episode reward: [(0, '4882.763')] [2024-12-13 05:24:55,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 6049792. Throughput: 0: 1058.4. Samples: 6049224. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:24:55,373][03180] Avg episode reward: [(0, '4857.769')] [2024-12-13 05:24:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011816_6049792.pth... [2024-12-13 05:24:55,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011752_6017024.pth [2024-12-13 05:25:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 6053888. Throughput: 0: 1088.8. Samples: 6054836. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:25:00,371][03180] Avg episode reward: [(0, '4852.100')] [2024-12-13 05:25:05,121][03226] Updated weights for policy 0, policy_version 11840 (0.0010) [2024-12-13 05:25:05,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6062080. Throughput: 0: 1094.6. Samples: 6062052. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:25:05,371][03180] Avg episode reward: [(0, '4884.990')] [2024-12-13 05:25:10,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6066176. Throughput: 0: 1096.6. Samples: 6065944. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:25:10,372][03180] Avg episode reward: [(0, '4821.759')] [2024-12-13 05:25:10,385][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011848_6066176.pth... [2024-12-13 05:25:10,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011784_6033408.pth [2024-12-13 05:25:15,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6070272. Throughput: 0: 1122.3. Samples: 6071952. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:25:15,374][03180] Avg episode reward: [(0, '4860.721')] [2024-12-13 05:25:20,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6078464. Throughput: 0: 1102.9. Samples: 6078708. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:25:20,371][03180] Avg episode reward: [(0, '4930.356')] [2024-12-13 05:25:25,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6082560. Throughput: 0: 1104.8. Samples: 6082560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:25:25,371][03180] Avg episode reward: [(0, '4937.751')] [2024-12-13 05:25:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011880_6082560.pth... [2024-12-13 05:25:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011816_6049792.pth [2024-12-13 05:25:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 6086656. Throughput: 0: 1130.7. Samples: 6088916. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:25:30,371][03180] Avg episode reward: [(0, '4898.412')] [2024-12-13 05:25:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6094848. Throughput: 0: 1112.4. Samples: 6095524. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:25:35,371][03180] Avg episode reward: [(0, '4889.609')] [2024-12-13 05:25:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1110.8). Total num frames: 6098944. Throughput: 0: 1114.9. Samples: 6099392. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:25:40,371][03180] Avg episode reward: [(0, '4860.416')] [2024-12-13 05:25:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011912_6098944.pth... [2024-12-13 05:25:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011848_6066176.pth [2024-12-13 05:25:40,802][03226] Updated weights for policy 0, policy_version 11920 (0.0016) [2024-12-13 05:25:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 6107136. Throughput: 0: 1144.5. Samples: 6106340. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:25:45,372][03180] Avg episode reward: [(0, '4859.288')] [2024-12-13 05:25:50,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 6111232. Throughput: 0: 1124.0. Samples: 6112636. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:25:50,373][03180] Avg episode reward: [(0, '4877.635')] [2024-12-13 05:25:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 6119424. Throughput: 0: 1123.6. Samples: 6116508. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:25:55,372][03180] Avg episode reward: [(0, '4843.212')] [2024-12-13 05:25:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011952_6119424.pth... [2024-12-13 05:25:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011880_6082560.pth [2024-12-13 05:26:00,375][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.4, 300 sec: 1110.8). Total num frames: 6123520. Throughput: 0: 1151.2. Samples: 6123760. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:26:00,376][03180] Avg episode reward: [(0, '4868.667')] [2024-12-13 05:26:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6127616. Throughput: 0: 1134.1. Samples: 6129744. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:26:05,371][03180] Avg episode reward: [(0, '4840.304')] [2024-12-13 05:26:10,371][03180] Fps is (10 sec: 1229.3, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 6135808. Throughput: 0: 1135.3. Samples: 6133648. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:26:10,371][03180] Avg episode reward: [(0, '4843.721')] [2024-12-13 05:26:10,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000011984_6135808.pth... 
[2024-12-13 05:26:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011912_6098944.pth [2024-12-13 05:26:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 6139904. Throughput: 0: 1158.4. Samples: 6141044. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:26:15,371][03180] Avg episode reward: [(0, '4794.694')] [2024-12-13 05:26:17,583][03226] Updated weights for policy 0, policy_version 12000 (0.0009) [2024-12-13 05:26:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6144000. Throughput: 0: 1134.7. Samples: 6146584. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:26:20,371][03180] Avg episode reward: [(0, '4695.986')] [2024-12-13 05:26:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 6152192. Throughput: 0: 1134.4. Samples: 6150440. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:26:25,371][03180] Avg episode reward: [(0, '4750.789')] [2024-12-13 05:26:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012016_6152192.pth... [2024-12-13 05:26:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011952_6119424.pth [2024-12-13 05:26:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 6156288. Throughput: 0: 1153.4. Samples: 6158244. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:26:30,372][03180] Avg episode reward: [(0, '4723.910')] [2024-12-13 05:26:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 6160384. Throughput: 0: 1132.0. Samples: 6163576. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:26:35,371][03180] Avg episode reward: [(0, '4735.694')] [2024-12-13 05:26:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 6168576. Throughput: 0: 1133.2. Samples: 6167500. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:26:40,371][03180] Avg episode reward: [(0, '4728.054')] [2024-12-13 05:26:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012048_6168576.pth... [2024-12-13 05:26:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000011984_6135808.pth [2024-12-13 05:26:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6172672. Throughput: 0: 1147.8. Samples: 6175408. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:26:45,371][03180] Avg episode reward: [(0, '4686.021')] [2024-12-13 05:26:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 6180864. Throughput: 0: 1135.0. Samples: 6180820. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:26:50,372][03180] Avg episode reward: [(0, '4690.988')] [2024-12-13 05:26:53,511][03226] Updated weights for policy 0, policy_version 12080 (0.0009) [2024-12-13 05:26:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6184960. Throughput: 0: 1129.9. Samples: 6184492. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:26:55,371][03180] Avg episode reward: [(0, '4734.187')] [2024-12-13 05:26:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012080_6184960.pth... [2024-12-13 05:26:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012016_6152192.pth [2024-12-13 05:27:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 6193152. Throughput: 0: 1133.8. Samples: 6192064. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:27:00,371][03180] Avg episode reward: [(0, '4747.446')] [2024-12-13 05:27:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 6197248. Throughput: 0: 1134.0. Samples: 6197612. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:27:05,374][03180] Avg episode reward: [(0, '4703.901')] [2024-12-13 05:27:10,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 6201344. Throughput: 0: 1131.0. Samples: 6201336. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:27:10,373][03180] Avg episode reward: [(0, '4703.329')] [2024-12-13 05:27:10,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012112_6201344.pth... [2024-12-13 05:27:10,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012048_6168576.pth [2024-12-13 05:27:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6209536. Throughput: 0: 1129.8. Samples: 6209084. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:27:15,371][03180] Avg episode reward: [(0, '4701.705')] [2024-12-13 05:27:20,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 6213632. Throughput: 0: 1142.0. Samples: 6214964. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:27:20,371][03180] Avg episode reward: [(0, '4663.988')] [2024-12-13 05:27:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6217728. Throughput: 0: 1123.3. Samples: 6218048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:27:25,372][03180] Avg episode reward: [(0, '4722.867')] [2024-12-13 05:27:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012144_6217728.pth... [2024-12-13 05:27:25,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012080_6184960.pth [2024-12-13 05:27:29,121][03226] Updated weights for policy 0, policy_version 12160 (0.0019) [2024-12-13 05:27:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 6225920. Throughput: 0: 1122.0. Samples: 6225900. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:27:30,371][03180] Avg episode reward: [(0, '4769.920')] [2024-12-13 05:27:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 6230016. Throughput: 0: 1140.2. Samples: 6232128. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:27:35,371][03180] Avg episode reward: [(0, '4805.563')] [2024-12-13 05:27:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 6234112. Throughput: 0: 1122.0. Samples: 6234980. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:27:40,371][03180] Avg episode reward: [(0, '4895.174')] [2024-12-13 05:27:40,439][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012184_6238208.pth... [2024-12-13 05:27:40,452][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012112_6201344.pth [2024-12-13 05:27:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 6242304. Throughput: 0: 1130.0. Samples: 6242916. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:27:45,371][03180] Avg episode reward: [(0, '4937.173')] [2024-12-13 05:27:50,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 6246400. Throughput: 0: 1155.1. Samples: 6249596. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:27:50,374][03180] Avg episode reward: [(0, '4928.995')] [2024-12-13 05:27:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 6254592. Throughput: 0: 1129.0. Samples: 6252140. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:27:55,372][03180] Avg episode reward: [(0, '4978.192')] [2024-12-13 05:27:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012216_6254592.pth... 
[2024-12-13 05:27:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012144_6217728.pth [2024-12-13 05:28:00,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6258688. Throughput: 0: 1124.2. Samples: 6259672. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:28:00,371][03180] Avg episode reward: [(0, '4999.345')] [2024-12-13 05:28:05,343][03226] Updated weights for policy 0, policy_version 12240 (0.0009) [2024-12-13 05:28:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6266880. Throughput: 0: 1151.1. Samples: 6266764. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:28:05,371][03180] Avg episode reward: [(0, '5007.961')] [2024-12-13 05:28:10,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 6270976. Throughput: 0: 1136.7. Samples: 6269204. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:28:10,376][03180] Avg episode reward: [(0, '5007.974')] [2024-12-13 05:28:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012248_6270976.pth... [2024-12-13 05:28:10,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012184_6238208.pth [2024-12-13 05:28:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6275072. Throughput: 0: 1073.9. Samples: 6274224. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:28:15,371][03180] Avg episode reward: [(0, '5049.836')] [2024-12-13 05:28:20,377][03180] Fps is (10 sec: 819.0, 60 sec: 1092.1, 300 sec: 1110.8). Total num frames: 6279168. Throughput: 0: 1092.4. Samples: 6281292. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:28:20,378][03180] Avg episode reward: [(0, '5060.090')] [2024-12-13 05:28:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 6283264. Throughput: 0: 1081.2. Samples: 6283636. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:28:25,371][03180] Avg episode reward: [(0, '5084.330')] [2024-12-13 05:28:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012272_6283264.pth... [2024-12-13 05:28:25,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012216_6254592.pth [2024-12-13 05:28:30,371][03180] Fps is (10 sec: 1229.6, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6291456. Throughput: 0: 1061.2. Samples: 6290672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:28:30,371][03180] Avg episode reward: [(0, '5043.208')] [2024-12-13 05:28:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6295552. Throughput: 0: 1080.8. Samples: 6298228. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:28:35,371][03180] Avg episode reward: [(0, '5032.785')] [2024-12-13 05:28:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 6299648. Throughput: 0: 1078.8. Samples: 6300684. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:28:40,371][03180] Avg episode reward: [(0, '4987.006')] [2024-12-13 05:28:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012304_6299648.pth... [2024-12-13 05:28:40,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012248_6270976.pth [2024-12-13 05:28:44,374][03226] Updated weights for policy 0, policy_version 12320 (0.0009) [2024-12-13 05:28:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6307840. Throughput: 0: 1062.7. Samples: 6307492. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:28:45,371][03180] Avg episode reward: [(0, '5040.622')] [2024-12-13 05:28:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6311936. Throughput: 0: 1078.1. Samples: 6315280. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:28:50,371][03180] Avg episode reward: [(0, '5064.870')] [2024-12-13 05:28:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 6316032. Throughput: 0: 1083.5. Samples: 6317956. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:28:55,371][03180] Avg episode reward: [(0, '5118.898')] [2024-12-13 05:28:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012336_6316032.pth... [2024-12-13 05:28:55,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012272_6283264.pth [2024-12-13 05:28:55,386][03213] Saving new best policy, reward=5118.898! [2024-12-13 05:29:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6324224. Throughput: 0: 1113.5. Samples: 6324332. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:29:00,372][03180] Avg episode reward: [(0, '5132.187')] [2024-12-13 05:29:00,381][03213] Saving new best policy, reward=5132.187! [2024-12-13 05:29:05,371][03180] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6332416. Throughput: 0: 1131.0. Samples: 6332180. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:29:05,371][03180] Avg episode reward: [(0, '5126.166')] [2024-12-13 05:29:10,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6336512. Throughput: 0: 1145.4. Samples: 6335184. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:29:10,374][03180] Avg episode reward: [(0, '5118.767')] [2024-12-13 05:29:10,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012376_6336512.pth... [2024-12-13 05:29:10,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012304_6299648.pth [2024-12-13 05:29:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6340608. Throughput: 0: 1124.5. Samples: 6341276. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:29:15,371][03180] Avg episode reward: [(0, '5072.595')] [2024-12-13 05:29:19,860][03226] Updated weights for policy 0, policy_version 12400 (0.0017) [2024-12-13 05:29:20,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1160.7, 300 sec: 1124.7). Total num frames: 6348800. Throughput: 0: 1128.4. Samples: 6349004. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:29:20,371][03180] Avg episode reward: [(0, '5074.761')] [2024-12-13 05:29:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6352896. Throughput: 0: 1148.6. Samples: 6352372. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:29:25,371][03180] Avg episode reward: [(0, '5077.331')] [2024-12-13 05:29:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012408_6352896.pth... [2024-12-13 05:29:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012336_6316032.pth [2024-12-13 05:29:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6356992. Throughput: 0: 1122.7. Samples: 6358012. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:29:30,371][03180] Avg episode reward: [(0, '5035.352')] [2024-12-13 05:29:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6365184. Throughput: 0: 1124.2. Samples: 6365868. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:29:35,371][03180] Avg episode reward: [(0, '5032.450')] [2024-12-13 05:29:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6369280. Throughput: 0: 1145.7. Samples: 6369512. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:29:40,372][03180] Avg episode reward: [(0, '5001.677')] [2024-12-13 05:29:40,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012440_6369280.pth... 
[2024-12-13 05:29:40,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012376_6336512.pth [2024-12-13 05:29:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6373376. Throughput: 0: 1128.4. Samples: 6375112. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:29:45,371][03180] Avg episode reward: [(0, '5005.089')] [2024-12-13 05:29:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6381568. Throughput: 0: 1125.9. Samples: 6382844. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:29:50,371][03180] Avg episode reward: [(0, '4987.669')] [2024-12-13 05:29:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6385664. Throughput: 0: 1145.2. Samples: 6386716. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:29:55,371][03180] Avg episode reward: [(0, '4966.897')] [2024-12-13 05:29:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012472_6385664.pth... [2024-12-13 05:29:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012408_6352896.pth [2024-12-13 05:29:56,763][03226] Updated weights for policy 0, policy_version 12480 (0.0010) [2024-12-13 05:30:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6389760. Throughput: 0: 1123.3. Samples: 6391824. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:30:00,371][03180] Avg episode reward: [(0, '4987.556')] [2024-12-13 05:30:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6397952. Throughput: 0: 1118.0. Samples: 6399316. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:30:05,371][03180] Avg episode reward: [(0, '5065.064')] [2024-12-13 05:30:10,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6402048. Throughput: 0: 1129.4. Samples: 6403196. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:30:10,373][03180] Avg episode reward: [(0, '5058.104')] [2024-12-13 05:30:10,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012504_6402048.pth... [2024-12-13 05:30:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012440_6369280.pth [2024-12-13 05:30:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6406144. Throughput: 0: 1128.2. Samples: 6408780. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:30:15,371][03180] Avg episode reward: [(0, '5062.738')] [2024-12-13 05:30:20,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6414336. Throughput: 0: 1119.8. Samples: 6416260. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:30:20,371][03180] Avg episode reward: [(0, '5014.253')] [2024-12-13 05:30:25,371][03180] Fps is (10 sec: 1638.4, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 6422528. Throughput: 0: 1124.4. Samples: 6420108. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:30:25,371][03180] Avg episode reward: [(0, '4985.653')] [2024-12-13 05:30:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012544_6422528.pth... [2024-12-13 05:30:25,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012472_6385664.pth [2024-12-13 05:30:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6426624. Throughput: 0: 1125.7. Samples: 6425768. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:30:30,371][03180] Avg episode reward: [(0, '4983.707')] [2024-12-13 05:30:33,458][03226] Updated weights for policy 0, policy_version 12560 (0.0012) [2024-12-13 05:30:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6430720. Throughput: 0: 1111.0. Samples: 6432840. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:30:35,371][03180] Avg episode reward: [(0, '4933.469')] [2024-12-13 05:30:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6438912. Throughput: 0: 1110.0. Samples: 6436664. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:30:40,371][03180] Avg episode reward: [(0, '4930.579')] [2024-12-13 05:30:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012576_6438912.pth... [2024-12-13 05:30:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012504_6402048.pth [2024-12-13 05:30:45,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6443008. Throughput: 0: 1135.4. Samples: 6442920. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:30:45,374][03180] Avg episode reward: [(0, '4986.186')] [2024-12-13 05:30:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6447104. Throughput: 0: 1115.6. Samples: 6449520. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:30:50,371][03180] Avg episode reward: [(0, '4975.570')] [2024-12-13 05:30:55,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6455296. Throughput: 0: 1116.0. Samples: 6453416. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:30:55,371][03180] Avg episode reward: [(0, '4994.699')] [2024-12-13 05:30:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012608_6455296.pth... [2024-12-13 05:30:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012544_6422528.pth [2024-12-13 05:31:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6459392. Throughput: 0: 1140.0. Samples: 6460080. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:31:00,371][03180] Avg episode reward: [(0, '5046.312')] [2024-12-13 05:31:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6463488. Throughput: 0: 1117.7. Samples: 6466556. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:31:05,371][03180] Avg episode reward: [(0, '5054.036')] [2024-12-13 05:31:09,062][03226] Updated weights for policy 0, policy_version 12640 (0.0009) [2024-12-13 05:31:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 6471680. Throughput: 0: 1118.5. Samples: 6470440. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:31:10,371][03180] Avg episode reward: [(0, '5068.875')] [2024-12-13 05:31:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012640_6471680.pth... [2024-12-13 05:31:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012576_6438912.pth [2024-12-13 05:31:15,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6475776. Throughput: 0: 1147.9. Samples: 6477424. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:31:15,373][03180] Avg episode reward: [(0, '5072.151')] [2024-12-13 05:31:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6479872. Throughput: 0: 1125.1. Samples: 6483468. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:31:20,371][03180] Avg episode reward: [(0, '5017.623')] [2024-12-13 05:31:25,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6488064. Throughput: 0: 1126.9. Samples: 6487376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:31:25,371][03180] Avg episode reward: [(0, '4963.157')] [2024-12-13 05:31:25,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012672_6488064.pth... 
[2024-12-13 05:31:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012608_6455296.pth [2024-12-13 05:31:30,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6492160. Throughput: 0: 1148.6. Samples: 6494604. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:31:30,372][03180] Avg episode reward: [(0, '4980.271')] [2024-12-13 05:31:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6500352. Throughput: 0: 1129.2. Samples: 6500332. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:31:35,371][03180] Avg episode reward: [(0, '4991.326')] [2024-12-13 05:31:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6504448. Throughput: 0: 1129.3. Samples: 6504236. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:31:40,371][03180] Avg episode reward: [(0, '4897.737')] [2024-12-13 05:31:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012704_6504448.pth... [2024-12-13 05:31:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012640_6471680.pth [2024-12-13 05:31:45,264][03226] Updated weights for policy 0, policy_version 12720 (0.0010) [2024-12-13 05:31:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 6512640. Throughput: 0: 1146.8. Samples: 6511688. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:31:45,380][03180] Avg episode reward: [(0, '4900.937')] [2024-12-13 05:31:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6516736. Throughput: 0: 1124.5. Samples: 6517160. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:31:50,371][03180] Avg episode reward: [(0, '4842.906')] [2024-12-13 05:31:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6520832. Throughput: 0: 1122.6. Samples: 6520956. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:31:55,371][03180] Avg episode reward: [(0, '4852.478')] [2024-12-13 05:31:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012736_6520832.pth... [2024-12-13 05:31:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012672_6488064.pth [2024-12-13 05:32:00,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1160.4, 300 sec: 1124.6). Total num frames: 6529024. Throughput: 0: 1143.3. Samples: 6528876. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:32:00,376][03180] Avg episode reward: [(0, '4841.640')] [2024-12-13 05:32:05,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1160.4, 300 sec: 1124.7). Total num frames: 6533120. Throughput: 0: 1124.8. Samples: 6534088. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:32:05,376][03180] Avg episode reward: [(0, '4832.910')] [2024-12-13 05:32:10,372][03180] Fps is (10 sec: 819.4, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 6537216. Throughput: 0: 1122.4. Samples: 6537884. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:32:10,373][03180] Avg episode reward: [(0, '4836.729')] [2024-12-13 05:32:10,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012768_6537216.pth... [2024-12-13 05:32:10,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012704_6504448.pth [2024-12-13 05:32:15,371][03180] Fps is (10 sec: 819.6, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6541312. Throughput: 0: 1088.0. Samples: 6543564. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:32:15,371][03180] Avg episode reward: [(0, '4838.394')] [2024-12-13 05:32:20,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6545408. Throughput: 0: 1074.0. Samples: 6548660. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:32:20,371][03180] Avg episode reward: [(0, '4836.668')] [2024-12-13 05:32:24,082][03226] Updated weights for policy 0, policy_version 12800 (0.0009) [2024-12-13 05:32:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6553600. Throughput: 0: 1068.8. Samples: 6552332. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:32:25,371][03180] Avg episode reward: [(0, '4754.242')] [2024-12-13 05:32:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012800_6553600.pth... [2024-12-13 05:32:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012736_6520832.pth [2024-12-13 05:32:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6557696. Throughput: 0: 1073.7. Samples: 6560004. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:32:30,371][03180] Avg episode reward: [(0, '4691.661')] [2024-12-13 05:32:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6565888. Throughput: 0: 1082.3. Samples: 6565864. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:32:35,371][03180] Avg episode reward: [(0, '4686.344')] [2024-12-13 05:32:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6569984. Throughput: 0: 1074.5. Samples: 6569308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:32:40,371][03180] Avg episode reward: [(0, '4683.467')] [2024-12-13 05:32:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012832_6569984.pth... [2024-12-13 05:32:40,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012768_6537216.pth [2024-12-13 05:32:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6578176. Throughput: 0: 1069.1. Samples: 6576980. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:32:45,374][03180] Avg episode reward: [(0, '4692.325')] [2024-12-13 05:32:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6582272. Throughput: 0: 1084.3. Samples: 6582876. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:32:50,371][03180] Avg episode reward: [(0, '4693.050')] [2024-12-13 05:32:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6586368. Throughput: 0: 1072.7. Samples: 6586156. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:32:55,375][03180] Avg episode reward: [(0, '4767.509')] [2024-12-13 05:32:55,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012864_6586368.pth... [2024-12-13 05:32:55,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012800_6553600.pth [2024-12-13 05:32:59,547][03226] Updated weights for policy 0, policy_version 12880 (0.0009) [2024-12-13 05:33:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1110.8). Total num frames: 6594560. Throughput: 0: 1119.6. Samples: 6593948. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:33:00,371][03180] Avg episode reward: [(0, '4726.728')] [2024-12-13 05:33:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6598656. Throughput: 0: 1142.6. Samples: 6600076. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:33:05,371][03180] Avg episode reward: [(0, '4817.465')] [2024-12-13 05:33:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6602752. Throughput: 0: 1126.8. Samples: 6603036. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:33:10,371][03180] Avg episode reward: [(0, '4803.178')] [2024-12-13 05:33:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012896_6602752.pth... 
[2024-12-13 05:33:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012832_6569984.pth [2024-12-13 05:33:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6610944. Throughput: 0: 1132.1. Samples: 6610948. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:33:15,371][03180] Avg episode reward: [(0, '4809.959')] [2024-12-13 05:33:20,378][03180] Fps is (10 sec: 1227.9, 60 sec: 1160.4, 300 sec: 1124.6). Total num frames: 6615040. Throughput: 0: 1146.9. Samples: 6617484. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:33:20,379][03180] Avg episode reward: [(0, '4841.113')] [2024-12-13 05:33:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6619136. Throughput: 0: 1130.0. Samples: 6620156. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:33:25,371][03180] Avg episode reward: [(0, '4778.948')] [2024-12-13 05:33:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012928_6619136.pth... [2024-12-13 05:33:25,380][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012864_6586368.pth [2024-12-13 05:33:30,371][03180] Fps is (10 sec: 1229.7, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6627328. Throughput: 0: 1131.2. Samples: 6627884. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:33:30,371][03180] Avg episode reward: [(0, '4754.143')] [2024-12-13 05:33:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6631424. Throughput: 0: 1154.1. Samples: 6634812. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:33:35,373][03180] Avg episode reward: [(0, '4743.610')] [2024-12-13 05:33:36,010][03226] Updated weights for policy 0, policy_version 12960 (0.0009) [2024-12-13 05:33:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6639616. Throughput: 0: 1138.0. Samples: 6637368. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:33:40,371][03180] Avg episode reward: [(0, '4737.788')] [2024-12-13 05:33:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000012968_6639616.pth... [2024-12-13 05:33:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012896_6602752.pth [2024-12-13 05:33:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6643712. Throughput: 0: 1137.0. Samples: 6645112. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:33:45,371][03180] Avg episode reward: [(0, '4813.363')] [2024-12-13 05:33:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 6651904. Throughput: 0: 1157.2. Samples: 6652152. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:33:50,371][03180] Avg episode reward: [(0, '4775.708')] [2024-12-13 05:33:55,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6656000. Throughput: 0: 1150.9. Samples: 6654828. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:33:55,373][03180] Avg episode reward: [(0, '4776.802')] [2024-12-13 05:33:55,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013000_6656000.pth... [2024-12-13 05:33:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012928_6619136.pth [2024-12-13 05:34:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6660096. Throughput: 0: 1134.0. Samples: 6661976. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:34:00,371][03180] Avg episode reward: [(0, '4764.670')] [2024-12-13 05:34:05,372][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6668288. Throughput: 0: 1152.4. Samples: 6669336. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:34:05,373][03180] Avg episode reward: [(0, '4815.707')] [2024-12-13 05:34:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6672384. Throughput: 0: 1153.1. Samples: 6672044. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:34:10,371][03180] Avg episode reward: [(0, '4900.562')] [2024-12-13 05:34:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013032_6672384.pth... [2024-12-13 05:34:10,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000012968_6639616.pth [2024-12-13 05:34:12,205][03226] Updated weights for policy 0, policy_version 13040 (0.0008) [2024-12-13 05:34:15,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6680576. Throughput: 0: 1134.2. Samples: 6678924. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:34:15,371][03180] Avg episode reward: [(0, '4743.845')] [2024-12-13 05:34:20,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 6684672. Throughput: 0: 1156.2. Samples: 6686844. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:34:20,375][03180] Avg episode reward: [(0, '4817.389')] [2024-12-13 05:34:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6688768. Throughput: 0: 1152.9. Samples: 6689248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:34:25,371][03180] Avg episode reward: [(0, '4817.738')] [2024-12-13 05:34:25,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013064_6688768.pth... [2024-12-13 05:34:25,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013000_6656000.pth [2024-12-13 05:34:30,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6696960. Throughput: 0: 1131.7. Samples: 6696040. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:34:30,371][03180] Avg episode reward: [(0, '4832.130')] [2024-12-13 05:34:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6701056. Throughput: 0: 1149.5. Samples: 6703880. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:34:35,371][03180] Avg episode reward: [(0, '4866.899')] [2024-12-13 05:34:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6705152. Throughput: 0: 1149.9. Samples: 6706572. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:34:40,372][03180] Avg episode reward: [(0, '4880.856')] [2024-12-13 05:34:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013096_6705152.pth... [2024-12-13 05:34:40,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013032_6672384.pth [2024-12-13 05:34:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6713344. Throughput: 0: 1138.3. Samples: 6713200. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:34:45,371][03180] Avg episode reward: [(0, '4822.490')] [2024-12-13 05:34:47,308][03226] Updated weights for policy 0, policy_version 13120 (0.0011) [2024-12-13 05:34:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6717440. Throughput: 0: 1150.8. Samples: 6721120. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:34:50,371][03180] Avg episode reward: [(0, '4960.371')] [2024-12-13 05:34:55,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 6725632. Throughput: 0: 1158.7. Samples: 6724192. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:34:55,376][03180] Avg episode reward: [(0, '4956.363')] [2024-12-13 05:34:55,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013136_6725632.pth... 
[2024-12-13 05:34:55,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013064_6688768.pth [2024-12-13 05:35:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6729728. Throughput: 0: 1140.1. Samples: 6730228. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:35:00,371][03180] Avg episode reward: [(0, '4991.359')] [2024-12-13 05:35:05,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1160.6, 300 sec: 1138.6). Total num frames: 6737920. Throughput: 0: 1139.6. Samples: 6738124. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:35:05,371][03180] Avg episode reward: [(0, '4964.353')] [2024-12-13 05:35:10,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 6742016. Throughput: 0: 1164.3. Samples: 6741644. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:35:10,373][03180] Avg episode reward: [(0, '5005.890')] [2024-12-13 05:35:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013168_6742016.pth... [2024-12-13 05:35:10,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013096_6705152.pth [2024-12-13 05:35:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6746112. Throughput: 0: 1141.4. Samples: 6747404. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:35:15,371][03180] Avg episode reward: [(0, '5055.335')] [2024-12-13 05:35:20,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 6754304. Throughput: 0: 1142.6. Samples: 6755296. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:35:20,371][03180] Avg episode reward: [(0, '5114.106')] [2024-12-13 05:35:22,402][03226] Updated weights for policy 0, policy_version 13200 (0.0015) [2024-12-13 05:35:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6758400. Throughput: 0: 1164.4. Samples: 6758968. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:35:25,371][03180] Avg episode reward: [(0, '5079.802')] [2024-12-13 05:35:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013200_6758400.pth... [2024-12-13 05:35:25,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013136_6725632.pth [2024-12-13 05:35:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6762496. Throughput: 0: 1138.6. Samples: 6764436. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:35:30,371][03180] Avg episode reward: [(0, '5040.249')] [2024-12-13 05:35:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6770688. Throughput: 0: 1138.8. Samples: 6772368. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:35:35,371][03180] Avg episode reward: [(0, '5109.393')] [2024-12-13 05:35:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6774784. Throughput: 0: 1159.1. Samples: 6776348. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:35:40,372][03180] Avg episode reward: [(0, '5163.778')] [2024-12-13 05:35:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013232_6774784.pth... [2024-12-13 05:35:40,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013168_6742016.pth [2024-12-13 05:35:40,394][03213] Saving new best policy, reward=5163.778! [2024-12-13 05:35:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 6782976. Throughput: 0: 1143.5. Samples: 6781684. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:35:45,371][03180] Avg episode reward: [(0, '5163.290')] [2024-12-13 05:35:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6787072. Throughput: 0: 1143.4. Samples: 6789576. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:35:50,371][03180] Avg episode reward: [(0, '5106.206')] [2024-12-13 05:35:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1138.5). Total num frames: 6795264. Throughput: 0: 1153.5. Samples: 6793548. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:35:55,371][03180] Avg episode reward: [(0, '5135.911')] [2024-12-13 05:35:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013272_6795264.pth... [2024-12-13 05:35:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013200_6758400.pth [2024-12-13 05:35:59,529][03226] Updated weights for policy 0, policy_version 13280 (0.0011) [2024-12-13 05:36:00,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1160.4, 300 sec: 1138.5). Total num frames: 6799360. Throughput: 0: 1147.6. Samples: 6799052. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:36:00,376][03180] Avg episode reward: [(0, '5135.227')] [2024-12-13 05:36:05,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 6803456. Throughput: 0: 1139.1. Samples: 6806556. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:36:05,373][03180] Avg episode reward: [(0, '5156.226')] [2024-12-13 05:36:10,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1160.6, 300 sec: 1138.6). Total num frames: 6811648. Throughput: 0: 1142.2. Samples: 6810368. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:36:10,375][03180] Avg episode reward: [(0, '5219.031')] [2024-12-13 05:36:10,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013304_6811648.pth... [2024-12-13 05:36:10,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013232_6774784.pth [2024-12-13 05:36:10,392][03213] Saving new best policy, reward=5219.031! [2024-12-13 05:36:15,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 6815744. Throughput: 0: 1135.8. Samples: 6815548. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:36:15,371][03180] Avg episode reward: [(0, '5216.477')] [2024-12-13 05:36:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6819840. Throughput: 0: 1066.2. Samples: 6820348. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:36:20,371][03180] Avg episode reward: [(0, '5225.434')] [2024-12-13 05:36:20,372][03213] Saving new best policy, reward=5225.434! [2024-12-13 05:36:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6823936. Throughput: 0: 1065.7. Samples: 6824304. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:36:25,371][03180] Avg episode reward: [(0, '5249.053')] [2024-12-13 05:36:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013328_6823936.pth... [2024-12-13 05:36:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013272_6795264.pth [2024-12-13 05:36:25,384][03213] Saving new best policy, reward=5249.053! [2024-12-13 05:36:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6832128. Throughput: 0: 1112.7. Samples: 6831756. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:36:30,371][03180] Avg episode reward: [(0, '5246.817')] [2024-12-13 05:36:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6836224. Throughput: 0: 1062.8. Samples: 6837400. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:36:35,371][03180] Avg episode reward: [(0, '5253.311')] [2024-12-13 05:36:35,372][03213] Saving new best policy, reward=5253.311! [2024-12-13 05:36:37,288][03226] Updated weights for policy 0, policy_version 13360 (0.0027) [2024-12-13 05:36:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6840320. Throughput: 0: 1061.2. Samples: 6841300. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:36:40,372][03180] Avg episode reward: [(0, '5250.944')] [2024-12-13 05:36:40,388][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013368_6844416.pth... [2024-12-13 05:36:40,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013304_6811648.pth [2024-12-13 05:36:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6848512. Throughput: 0: 1109.3. Samples: 6848964. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:36:45,371][03180] Avg episode reward: [(0, '5256.910')] [2024-12-13 05:36:45,372][03213] Saving new best policy, reward=5256.910! [2024-12-13 05:36:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6852608. Throughput: 0: 1060.5. Samples: 6854276. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:36:50,371][03180] Avg episode reward: [(0, '5274.199')] [2024-12-13 05:36:50,372][03213] Saving new best policy, reward=5274.199! [2024-12-13 05:36:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6860800. Throughput: 0: 1061.2. Samples: 6858124. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:36:55,371][03180] Avg episode reward: [(0, '5288.662')] [2024-12-13 05:36:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013400_6860800.pth... [2024-12-13 05:36:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013328_6823936.pth [2024-12-13 05:36:55,387][03213] Saving new best policy, reward=5288.662! [2024-12-13 05:37:00,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6864896. Throughput: 0: 1115.1. Samples: 6865732. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:37:00,375][03180] Avg episode reward: [(0, '5289.568')] [2024-12-13 05:37:00,376][03213] Saving new best policy, reward=5289.568! 
[2024-12-13 05:37:05,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6868992. Throughput: 0: 1132.4. Samples: 6871308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:37:05,373][03180] Avg episode reward: [(0, '5326.027')] [2024-12-13 05:37:05,374][03213] Saving new best policy, reward=5326.027! [2024-12-13 05:37:10,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6877184. Throughput: 0: 1123.1. Samples: 6874844. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:37:10,371][03180] Avg episode reward: [(0, '5263.397')] [2024-12-13 05:37:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013432_6877184.pth... [2024-12-13 05:37:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013368_6844416.pth [2024-12-13 05:37:13,018][03226] Updated weights for policy 0, policy_version 13440 (0.0009) [2024-12-13 05:37:15,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6881280. Throughput: 0: 1131.3. Samples: 6882664. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:37:15,371][03180] Avg episode reward: [(0, '5318.142')] [2024-12-13 05:37:20,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 6885376. Throughput: 0: 1134.6. Samples: 6888460. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:37:20,372][03180] Avg episode reward: [(0, '5313.325')] [2024-12-13 05:37:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 6893568. Throughput: 0: 1124.4. Samples: 6891900. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:37:25,371][03180] Avg episode reward: [(0, '5296.805')] [2024-12-13 05:37:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013464_6893568.pth... 
[2024-12-13 05:37:25,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013400_6860800.pth [2024-12-13 05:37:30,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6897664. Throughput: 0: 1124.3. Samples: 6899556. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:37:30,373][03180] Avg episode reward: [(0, '5233.997')] [2024-12-13 05:37:35,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 6905856. Throughput: 0: 1144.8. Samples: 6905792. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:37:35,372][03180] Avg episode reward: [(0, '5233.110')] [2024-12-13 05:37:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6909952. Throughput: 0: 1127.8. Samples: 6908876. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:37:40,371][03180] Avg episode reward: [(0, '5201.127')] [2024-12-13 05:37:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013496_6909952.pth... [2024-12-13 05:37:40,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013432_6877184.pth [2024-12-13 05:37:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6914048. Throughput: 0: 1128.3. Samples: 6916504. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:37:45,372][03180] Avg episode reward: [(0, '5197.502')] [2024-12-13 05:37:49,813][03226] Updated weights for policy 0, policy_version 13520 (0.0014) [2024-12-13 05:37:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 6922240. Throughput: 0: 1136.1. Samples: 6922432. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:37:50,371][03180] Avg episode reward: [(0, '5169.703')] [2024-12-13 05:37:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6926336. Throughput: 0: 1119.6. Samples: 6925224. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:37:55,371][03180] Avg episode reward: [(0, '5190.095')] [2024-12-13 05:37:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013528_6926336.pth... [2024-12-13 05:37:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013464_6893568.pth [2024-12-13 05:38:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6930432. Throughput: 0: 1114.8. Samples: 6932828. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:38:00,371][03180] Avg episode reward: [(0, '5149.056')] [2024-12-13 05:38:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1138.5). Total num frames: 6938624. Throughput: 0: 1132.6. Samples: 6939428. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:38:05,371][03180] Avg episode reward: [(0, '5128.105')] [2024-12-13 05:38:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6942720. Throughput: 0: 1114.2. Samples: 6942040. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:38:10,371][03180] Avg episode reward: [(0, '5130.650')] [2024-12-13 05:38:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013560_6942720.pth... [2024-12-13 05:38:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013496_6909952.pth [2024-12-13 05:38:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.6). Total num frames: 6950912. Throughput: 0: 1117.2. Samples: 6949828. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:38:15,371][03180] Avg episode reward: [(0, '5138.002')] [2024-12-13 05:38:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1138.5). Total num frames: 6955008. Throughput: 0: 1133.7. Samples: 6956808. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:38:20,371][03180] Avg episode reward: [(0, '5154.015')] [2024-12-13 05:38:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6959104. Throughput: 0: 1118.3. Samples: 6959200. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:38:25,371][03180] Avg episode reward: [(0, '5175.800')] [2024-12-13 05:38:25,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013592_6959104.pth... [2024-12-13 05:38:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013528_6926336.pth [2024-12-13 05:38:26,338][03226] Updated weights for policy 0, policy_version 13600 (0.0010) [2024-12-13 05:38:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 6967296. Throughput: 0: 1115.3. Samples: 6966692. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:38:30,371][03180] Avg episode reward: [(0, '5171.885')] [2024-12-13 05:38:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6971392. Throughput: 0: 1141.2. Samples: 6973784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:38:35,371][03180] Avg episode reward: [(0, '5156.134')] [2024-12-13 05:38:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6975488. Throughput: 0: 1132.2. Samples: 6976172. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:38:40,372][03180] Avg episode reward: [(0, '5201.554')] [2024-12-13 05:38:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013624_6975488.pth... [2024-12-13 05:38:40,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013560_6942720.pth [2024-12-13 05:38:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 6983680. Throughput: 0: 1127.5. Samples: 6983564. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:38:45,371][03180] Avg episode reward: [(0, '5117.688')] [2024-12-13 05:38:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6987776. Throughput: 0: 1145.5. Samples: 6990976. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:38:50,371][03180] Avg episode reward: [(0, '5142.701')] [2024-12-13 05:38:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 6991872. Throughput: 0: 1143.2. Samples: 6993484. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:38:55,372][03180] Avg episode reward: [(0, '5143.436')] [2024-12-13 05:38:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013656_6991872.pth... [2024-12-13 05:38:55,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013592_6959104.pth [2024-12-13 05:39:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7000064. Throughput: 0: 1121.0. Samples: 7000272. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:39:00,371][03180] Avg episode reward: [(0, '5215.519')] [2024-12-13 05:39:02,043][03226] Updated weights for policy 0, policy_version 13680 (0.0009) [2024-12-13 05:39:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7004160. Throughput: 0: 1139.1. Samples: 7008068. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:39:05,371][03180] Avg episode reward: [(0, '5224.070')] [2024-12-13 05:39:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7012352. Throughput: 0: 1140.8. Samples: 7010536. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:39:10,371][03180] Avg episode reward: [(0, '5231.481')] [2024-12-13 05:39:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013696_7012352.pth... 
[2024-12-13 05:39:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013624_6975488.pth [2024-12-13 05:39:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7016448. Throughput: 0: 1125.0. Samples: 7017316. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:39:15,371][03180] Avg episode reward: [(0, '5209.606')] [2024-12-13 05:39:20,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 7024640. Throughput: 0: 1143.9. Samples: 7025264. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:39:20,374][03180] Avg episode reward: [(0, '5240.880')] [2024-12-13 05:39:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7028736. Throughput: 0: 1154.1. Samples: 7028108. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:39:25,371][03180] Avg episode reward: [(0, '5257.010')] [2024-12-13 05:39:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013728_7028736.pth... [2024-12-13 05:39:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013656_6991872.pth [2024-12-13 05:39:30,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7032832. Throughput: 0: 1130.8. Samples: 7034448. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:39:30,371][03180] Avg episode reward: [(0, '5240.545')] [2024-12-13 05:39:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 7041024. Throughput: 0: 1136.3. Samples: 7042108. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:39:35,373][03180] Avg episode reward: [(0, '5255.746')] [2024-12-13 05:39:38,059][03226] Updated weights for policy 0, policy_version 13760 (0.0009) [2024-12-13 05:39:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7045120. Throughput: 0: 1147.5. Samples: 7045120. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:39:40,372][03180] Avg episode reward: [(0, '5235.725')] [2024-12-13 05:39:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013760_7045120.pth... [2024-12-13 05:39:40,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013696_7012352.pth [2024-12-13 05:39:45,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 7049216. Throughput: 0: 1133.1. Samples: 7051264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:39:45,373][03180] Avg episode reward: [(0, '5150.915')] [2024-12-13 05:39:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7057408. Throughput: 0: 1134.8. Samples: 7059132. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:39:50,371][03180] Avg episode reward: [(0, '5124.465')] [2024-12-13 05:39:55,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7061504. Throughput: 0: 1152.2. Samples: 7062384. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:39:55,371][03180] Avg episode reward: [(0, '5105.950')] [2024-12-13 05:39:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013792_7061504.pth... [2024-12-13 05:39:55,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013728_7028736.pth [2024-12-13 05:40:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7069696. Throughput: 0: 1129.8. Samples: 7068156. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:40:00,371][03180] Avg episode reward: [(0, '5077.842')] [2024-12-13 05:40:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7073792. Throughput: 0: 1125.8. Samples: 7075924. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:40:05,371][03180] Avg episode reward: [(0, '5089.132')] [2024-12-13 05:40:10,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 7077888. Throughput: 0: 1138.0. Samples: 7079320. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:40:10,375][03180] Avg episode reward: [(0, '5071.063')] [2024-12-13 05:40:10,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013824_7077888.pth... [2024-12-13 05:40:10,403][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013760_7045120.pth [2024-12-13 05:40:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7081984. Throughput: 0: 1122.0. Samples: 7084940. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:40:15,371][03180] Avg episode reward: [(0, '5067.785')] [2024-12-13 05:40:15,473][03226] Updated weights for policy 0, policy_version 13840 (0.0009) [2024-12-13 05:40:20,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7090176. Throughput: 0: 1069.4. Samples: 7090232. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:40:20,371][03180] Avg episode reward: [(0, '4951.482')] [2024-12-13 05:40:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7094272. Throughput: 0: 1086.0. Samples: 7093988. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:40:25,372][03180] Avg episode reward: [(0, '4948.371')] [2024-12-13 05:40:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013856_7094272.pth... [2024-12-13 05:40:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013792_7061504.pth [2024-12-13 05:40:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7098368. Throughput: 0: 1067.3. Samples: 7099292. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:40:30,371][03180] Avg episode reward: [(0, '4916.540')] [2024-12-13 05:40:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7106560. Throughput: 0: 1067.6. Samples: 7107172. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:40:35,371][03180] Avg episode reward: [(0, '4945.684')] [2024-12-13 05:40:40,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7110656. Throughput: 0: 1080.6. Samples: 7111012. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:40:40,372][03180] Avg episode reward: [(0, '4912.440')] [2024-12-13 05:40:40,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013888_7110656.pth... [2024-12-13 05:40:40,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013824_7077888.pth [2024-12-13 05:40:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7114752. Throughput: 0: 1072.6. Samples: 7116424. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:40:45,372][03180] Avg episode reward: [(0, '4916.621')] [2024-12-13 05:40:50,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7122944. Throughput: 0: 1071.1. Samples: 7124124. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:40:50,371][03180] Avg episode reward: [(0, '4911.631')] [2024-12-13 05:40:52,280][03226] Updated weights for policy 0, policy_version 13920 (0.0010) [2024-12-13 05:40:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7127040. Throughput: 0: 1082.6. Samples: 7128036. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:40:55,371][03180] Avg episode reward: [(0, '4928.025')] [2024-12-13 05:40:55,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013920_7127040.pth... 
[2024-12-13 05:40:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013856_7094272.pth [2024-12-13 05:41:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 7131136. Throughput: 0: 1079.5. Samples: 7133516. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:41:00,371][03180] Avg episode reward: [(0, '4919.146')] [2024-12-13 05:41:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7139328. Throughput: 0: 1128.3. Samples: 7141004. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:41:05,371][03180] Avg episode reward: [(0, '4908.133')] [2024-12-13 05:41:10,371][03180] Fps is (10 sec: 1638.4, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 7147520. Throughput: 0: 1132.8. Samples: 7144964. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:41:10,372][03180] Avg episode reward: [(0, '4928.830')] [2024-12-13 05:41:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013960_7147520.pth... [2024-12-13 05:41:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013888_7110656.pth [2024-12-13 05:41:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7151616. Throughput: 0: 1145.2. Samples: 7150824. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:41:15,371][03180] Avg episode reward: [(0, '4917.338')] [2024-12-13 05:41:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7155712. Throughput: 0: 1134.8. Samples: 7158240. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:41:20,371][03180] Avg episode reward: [(0, '4943.306')] [2024-12-13 05:41:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7163904. Throughput: 0: 1134.4. Samples: 7162060. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:41:25,371][03180] Avg episode reward: [(0, '4949.648')] [2024-12-13 05:41:25,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000013992_7163904.pth... [2024-12-13 05:41:25,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013920_7127040.pth [2024-12-13 05:41:29,088][03226] Updated weights for policy 0, policy_version 14000 (0.0009) [2024-12-13 05:41:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7168000. Throughput: 0: 1146.1. Samples: 7168000. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:41:30,379][03180] Avg episode reward: [(0, '4952.076')] [2024-12-13 05:41:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7172096. Throughput: 0: 1133.5. Samples: 7175132. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:41:35,371][03180] Avg episode reward: [(0, '5068.008')] [2024-12-13 05:41:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7180288. Throughput: 0: 1131.3. Samples: 7178944. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:41:40,371][03180] Avg episode reward: [(0, '5069.714')] [2024-12-13 05:41:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014024_7180288.pth... [2024-12-13 05:41:40,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013960_7147520.pth [2024-12-13 05:41:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7184384. Throughput: 0: 1154.3. Samples: 7185460. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:41:45,371][03180] Avg episode reward: [(0, '5057.497')] [2024-12-13 05:41:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7188480. Throughput: 0: 1135.8. Samples: 7192116. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:41:50,372][03180] Avg episode reward: [(0, '5110.138')] [2024-12-13 05:41:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7196672. Throughput: 0: 1132.1. Samples: 7195908. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:41:55,371][03180] Avg episode reward: [(0, '5087.968')] [2024-12-13 05:41:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014056_7196672.pth... [2024-12-13 05:41:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000013992_7163904.pth [2024-12-13 05:42:00,378][03180] Fps is (10 sec: 1227.9, 60 sec: 1160.4, 300 sec: 1124.6). Total num frames: 7200768. Throughput: 0: 1147.5. Samples: 7202468. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:42:00,379][03180] Avg episode reward: [(0, '5103.718')] [2024-12-13 05:42:05,372][03226] Updated weights for policy 0, policy_version 14080 (0.0012) [2024-12-13 05:42:05,379][03180] Fps is (10 sec: 1227.8, 60 sec: 1160.4, 300 sec: 1124.6). Total num frames: 7208960. Throughput: 0: 1119.3. Samples: 7208616. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:42:05,380][03180] Avg episode reward: [(0, '5102.171')] [2024-12-13 05:42:10,371][03180] Fps is (10 sec: 1229.7, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7213056. Throughput: 0: 1120.5. Samples: 7212484. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:42:10,371][03180] Avg episode reward: [(0, '5054.114')] [2024-12-13 05:42:10,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014088_7213056.pth... [2024-12-13 05:42:10,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014024_7180288.pth [2024-12-13 05:42:15,371][03180] Fps is (10 sec: 819.9, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7217152. Throughput: 0: 1145.7. Samples: 7219556. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:42:15,371][03180] Avg episode reward: [(0, '5055.991')] [2024-12-13 05:42:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7225344. Throughput: 0: 1118.5. Samples: 7225464. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:42:20,371][03180] Avg episode reward: [(0, '5062.325')] [2024-12-13 05:42:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7229440. Throughput: 0: 1120.8. Samples: 7229380. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:42:25,371][03180] Avg episode reward: [(0, '5056.340')] [2024-12-13 05:42:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014120_7229440.pth... [2024-12-13 05:42:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014056_7196672.pth [2024-12-13 05:42:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7237632. Throughput: 0: 1136.1. Samples: 7236584. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:42:30,378][03180] Avg episode reward: [(0, '5035.800')] [2024-12-13 05:42:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7241728. Throughput: 0: 1106.9. Samples: 7241928. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:42:35,371][03180] Avg episode reward: [(0, '4997.990')] [2024-12-13 05:42:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7245824. Throughput: 0: 1109.2. Samples: 7245824. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:42:40,371][03180] Avg episode reward: [(0, '4979.329')] [2024-12-13 05:42:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014152_7245824.pth... 
[2024-12-13 05:42:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014088_7213056.pth [2024-12-13 05:42:41,293][03226] Updated weights for policy 0, policy_version 14160 (0.0011) [2024-12-13 05:42:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7254016. Throughput: 0: 1135.2. Samples: 7253544. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:42:45,372][03180] Avg episode reward: [(0, '4985.949')] [2024-12-13 05:42:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7258112. Throughput: 0: 1113.6. Samples: 7258720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:42:50,371][03180] Avg episode reward: [(0, '5050.367')] [2024-12-13 05:42:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7262208. Throughput: 0: 1113.0. Samples: 7262568. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:42:55,375][03180] Avg episode reward: [(0, '5015.336')] [2024-12-13 05:42:55,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014184_7262208.pth... [2024-12-13 05:42:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014120_7229440.pth [2024-12-13 05:43:00,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 7270400. Throughput: 0: 1127.3. Samples: 7270288. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:43:00,374][03180] Avg episode reward: [(0, '4966.655')] [2024-12-13 05:43:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1124.7). Total num frames: 7274496. Throughput: 0: 1112.2. Samples: 7275512. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:43:05,371][03180] Avg episode reward: [(0, '4968.772')] [2024-12-13 05:43:10,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7278592. Throughput: 0: 1107.6. Samples: 7279220. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:43:10,371][03180] Avg episode reward: [(0, '4877.771')] [2024-12-13 05:43:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014216_7278592.pth... [2024-12-13 05:43:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014152_7245824.pth [2024-12-13 05:43:15,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7286784. Throughput: 0: 1116.9. Samples: 7286848. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:43:15,375][03180] Avg episode reward: [(0, '4873.401')] [2024-12-13 05:43:18,067][03226] Updated weights for policy 0, policy_version 14240 (0.0016) [2024-12-13 05:43:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7290880. Throughput: 0: 1126.0. Samples: 7292600. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:43:20,371][03180] Avg episode reward: [(0, '4870.572')] [2024-12-13 05:43:25,371][03180] Fps is (10 sec: 819.5, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7294976. Throughput: 0: 1113.0. Samples: 7295908. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:43:25,371][03180] Avg episode reward: [(0, '4877.258')] [2024-12-13 05:43:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014248_7294976.pth... [2024-12-13 05:43:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014184_7262208.pth [2024-12-13 05:43:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7303168. Throughput: 0: 1111.3. Samples: 7303552. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:43:30,371][03180] Avg episode reward: [(0, '4859.674')] [2024-12-13 05:43:35,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 7307264. Throughput: 0: 1129.4. Samples: 7309544. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:43:35,373][03180] Avg episode reward: [(0, '4862.315')] [2024-12-13 05:43:40,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7315456. Throughput: 0: 1111.7. Samples: 7312596. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:43:40,373][03180] Avg episode reward: [(0, '4790.644')] [2024-12-13 05:43:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014288_7315456.pth... [2024-12-13 05:43:40,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014216_7278592.pth [2024-12-13 05:43:45,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7319552. Throughput: 0: 1109.1. Samples: 7320196. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:43:45,371][03180] Avg episode reward: [(0, '4795.766')] [2024-12-13 05:43:50,379][03180] Fps is (10 sec: 818.6, 60 sec: 1092.1, 300 sec: 1124.6). Total num frames: 7323648. Throughput: 0: 1135.7. Samples: 7326628. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:43:50,380][03180] Avg episode reward: [(0, '4734.240')] [2024-12-13 05:43:54,994][03226] Updated weights for policy 0, policy_version 14320 (0.0015) [2024-12-13 05:43:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7331840. Throughput: 0: 1113.5. Samples: 7329328. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:43:55,371][03180] Avg episode reward: [(0, '4819.077')] [2024-12-13 05:43:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014320_7331840.pth... [2024-12-13 05:43:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014248_7294976.pth [2024-12-13 05:44:00,373][03180] Fps is (10 sec: 1229.5, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7335936. Throughput: 0: 1113.1. Samples: 7336936. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:44:00,374][03180] Avg episode reward: [(0, '4781.822')] [2024-12-13 05:44:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7340032. Throughput: 0: 1134.7. Samples: 7343660. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:44:05,371][03180] Avg episode reward: [(0, '4682.577')] [2024-12-13 05:44:10,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7348224. Throughput: 0: 1115.2. Samples: 7346092. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:44:10,371][03180] Avg episode reward: [(0, '4662.451')] [2024-12-13 05:44:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014352_7348224.pth... [2024-12-13 05:44:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014288_7315456.pth [2024-12-13 05:44:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7352320. Throughput: 0: 1110.3. Samples: 7353516. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:44:15,371][03180] Avg episode reward: [(0, '4667.629')] [2024-12-13 05:44:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7356416. Throughput: 0: 1122.7. Samples: 7360064. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:44:20,371][03180] Avg episode reward: [(0, '4635.575')] [2024-12-13 05:44:25,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 7360512. Throughput: 0: 1098.8. Samples: 7362044. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:44:25,374][03180] Avg episode reward: [(0, '4665.029')] [2024-12-13 05:44:25,387][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014376_7360512.pth... 
[2024-12-13 05:44:25,395][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014320_7331840.pth [2024-12-13 05:44:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7368704. Throughput: 0: 1050.7. Samples: 7367476. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:44:30,371][03180] Avg episode reward: [(0, '4647.522')] [2024-12-13 05:44:33,301][03226] Updated weights for policy 0, policy_version 14400 (0.0014) [2024-12-13 05:44:35,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7372800. Throughput: 0: 1076.6. Samples: 7375064. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:44:35,371][03180] Avg episode reward: [(0, '4672.964')] [2024-12-13 05:44:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 7376896. Throughput: 0: 1088.4. Samples: 7378308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:44:40,371][03180] Avg episode reward: [(0, '4692.834')] [2024-12-13 05:44:40,387][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014408_7376896.pth... [2024-12-13 05:44:40,394][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014352_7348224.pth [2024-12-13 05:44:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7385088. Throughput: 0: 1045.3. Samples: 7383972. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:44:45,371][03180] Avg episode reward: [(0, '4773.634')] [2024-12-13 05:44:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1110.8). Total num frames: 7389184. Throughput: 0: 1067.2. Samples: 7391684. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:44:50,371][03180] Avg episode reward: [(0, '4746.925')] [2024-12-13 05:44:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 7393280. Throughput: 0: 1092.2. Samples: 7395240. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:44:55,372][03180] Avg episode reward: [(0, '4763.022')] [2024-12-13 05:44:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014440_7393280.pth... [2024-12-13 05:44:55,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014376_7360512.pth [2024-12-13 05:45:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7401472. Throughput: 0: 1047.0. Samples: 7400632. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:45:00,371][03180] Avg episode reward: [(0, '4862.040')] [2024-12-13 05:45:05,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7405568. Throughput: 0: 1070.3. Samples: 7408228. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:45:05,371][03180] Avg episode reward: [(0, '4816.673')] [2024-12-13 05:45:10,043][03226] Updated weights for policy 0, policy_version 14480 (0.0013) [2024-12-13 05:45:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7413760. Throughput: 0: 1110.3. Samples: 7412004. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:45:10,371][03180] Avg episode reward: [(0, '4852.582')] [2024-12-13 05:45:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014480_7413760.pth... [2024-12-13 05:45:10,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014408_7376896.pth [2024-12-13 05:45:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7417856. Throughput: 0: 1102.5. Samples: 7417088. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:45:15,371][03180] Avg episode reward: [(0, '4838.793')] [2024-12-13 05:45:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7421952. Throughput: 0: 1098.2. Samples: 7424484. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:45:20,371][03180] Avg episode reward: [(0, '4808.618')] [2024-12-13 05:45:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 7430144. Throughput: 0: 1110.0. Samples: 7428256. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:45:25,371][03180] Avg episode reward: [(0, '4674.299')] [2024-12-13 05:45:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014512_7430144.pth... [2024-12-13 05:45:25,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014440_7393280.pth [2024-12-13 05:45:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7434240. Throughput: 0: 1106.2. Samples: 7433752. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:45:30,371][03180] Avg episode reward: [(0, '4635.682')] [2024-12-13 05:45:35,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 7438336. Throughput: 0: 1095.7. Samples: 7440992. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:45:35,374][03180] Avg episode reward: [(0, '4736.577')] [2024-12-13 05:45:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7446528. Throughput: 0: 1103.2. Samples: 7444884. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:45:40,371][03180] Avg episode reward: [(0, '4697.040')] [2024-12-13 05:45:40,385][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014544_7446528.pth... [2024-12-13 05:45:40,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014480_7413760.pth [2024-12-13 05:45:45,372][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 7450624. Throughput: 0: 1111.9. Samples: 7450668. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:45:45,373][03180] Avg episode reward: [(0, '4791.743')] [2024-12-13 05:45:47,713][03226] Updated weights for policy 0, policy_version 14560 (0.0008) [2024-12-13 05:45:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7454720. Throughput: 0: 1102.0. Samples: 7457820. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:45:50,371][03180] Avg episode reward: [(0, '4711.490')] [2024-12-13 05:45:55,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7462912. Throughput: 0: 1102.6. Samples: 7461620. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:45:55,371][03180] Avg episode reward: [(0, '4692.944')] [2024-12-13 05:45:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014576_7462912.pth... [2024-12-13 05:45:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014512_7430144.pth [2024-12-13 05:46:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7467008. Throughput: 0: 1123.4. Samples: 7467640. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:46:00,371][03180] Avg episode reward: [(0, '4782.389')] [2024-12-13 05:46:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7471104. Throughput: 0: 1107.7. Samples: 7474332. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:46:05,371][03180] Avg episode reward: [(0, '4753.488')] [2024-12-13 05:46:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7479296. Throughput: 0: 1106.4. Samples: 7478044. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:46:10,371][03180] Avg episode reward: [(0, '4750.636')] [2024-12-13 05:46:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014608_7479296.pth... 
[2024-12-13 05:46:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014544_7446528.pth [2024-12-13 05:46:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7483392. Throughput: 0: 1126.8. Samples: 7484460. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:46:15,371][03180] Avg episode reward: [(0, '4744.294')] [2024-12-13 05:46:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7487488. Throughput: 0: 1108.8. Samples: 7490884. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:46:20,371][03180] Avg episode reward: [(0, '4710.757')] [2024-12-13 05:46:23,892][03226] Updated weights for policy 0, policy_version 14640 (0.0009) [2024-12-13 05:46:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7495680. Throughput: 0: 1104.9. Samples: 7494604. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:46:25,371][03180] Avg episode reward: [(0, '4730.403')] [2024-12-13 05:46:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014640_7495680.pth... [2024-12-13 05:46:25,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014576_7462912.pth [2024-12-13 05:46:30,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 7499776. Throughput: 0: 1126.9. Samples: 7501380. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:46:30,376][03180] Avg episode reward: [(0, '4727.270')] [2024-12-13 05:46:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7503872. Throughput: 0: 1099.9. Samples: 7507316. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:46:35,371][03180] Avg episode reward: [(0, '4732.219')] [2024-12-13 05:46:40,372][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 7512064. Throughput: 0: 1099.6. Samples: 7511104. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:46:40,373][03180] Avg episode reward: [(0, '4786.907')] [2024-12-13 05:46:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014672_7512064.pth... [2024-12-13 05:46:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014608_7479296.pth [2024-12-13 05:46:45,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7516160. Throughput: 0: 1122.0. Samples: 7518132. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:46:45,373][03180] Avg episode reward: [(0, '4890.224')] [2024-12-13 05:46:50,373][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 7520256. Throughput: 0: 1099.6. Samples: 7523816. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:46:50,373][03180] Avg episode reward: [(0, '4888.514')] [2024-12-13 05:46:55,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7528448. Throughput: 0: 1103.0. Samples: 7527680. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:46:55,371][03180] Avg episode reward: [(0, '4903.437')] [2024-12-13 05:46:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014704_7528448.pth... [2024-12-13 05:46:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014640_7495680.pth [2024-12-13 05:47:00,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7532544. Throughput: 0: 1124.9. Samples: 7535080. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:47:00,371][03180] Avg episode reward: [(0, '4930.568')] [2024-12-13 05:47:01,142][03226] Updated weights for policy 0, policy_version 14720 (0.0010) [2024-12-13 05:47:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7536640. Throughput: 0: 1099.9. Samples: 7540380. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:47:05,372][03180] Avg episode reward: [(0, '4884.642')] [2024-12-13 05:47:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7544832. Throughput: 0: 1100.4. Samples: 7544120. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:47:10,371][03180] Avg episode reward: [(0, '4834.014')] [2024-12-13 05:47:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014736_7544832.pth... [2024-12-13 05:47:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014672_7512064.pth [2024-12-13 05:47:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7548928. Throughput: 0: 1120.4. Samples: 7551792. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:47:15,371][03180] Avg episode reward: [(0, '4857.174')] [2024-12-13 05:47:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7557120. Throughput: 0: 1101.4. Samples: 7556880. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:47:20,371][03180] Avg episode reward: [(0, '4798.387')] [2024-12-13 05:47:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7561216. Throughput: 0: 1102.2. Samples: 7560700. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:47:25,371][03180] Avg episode reward: [(0, '4777.942')] [2024-12-13 05:47:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014768_7561216.pth... [2024-12-13 05:47:25,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014704_7528448.pth [2024-12-13 05:47:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 7569408. Throughput: 0: 1117.1. Samples: 7568400. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:47:30,371][03180] Avg episode reward: [(0, '4785.775')] [2024-12-13 05:47:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7573504. Throughput: 0: 1108.5. Samples: 7573696. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:47:35,373][03180] Avg episode reward: [(0, '4821.973')] [2024-12-13 05:47:38,273][03226] Updated weights for policy 0, policy_version 14800 (0.0009) [2024-12-13 05:47:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7577600. Throughput: 0: 1104.9. Samples: 7577400. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:47:40,371][03180] Avg episode reward: [(0, '4828.360')] [2024-12-13 05:47:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014800_7577600.pth... [2024-12-13 05:47:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014736_7544832.pth [2024-12-13 05:47:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1110.8). Total num frames: 7585792. Throughput: 0: 1110.3. Samples: 7585044. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:47:45,371][03180] Avg episode reward: [(0, '4812.447')] [2024-12-13 05:47:50,393][03180] Fps is (10 sec: 1226.0, 60 sec: 1160.1, 300 sec: 1110.7). Total num frames: 7589888. Throughput: 0: 1115.3. Samples: 7590596. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:47:50,394][03180] Avg episode reward: [(0, '4824.797')] [2024-12-13 05:47:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7593984. Throughput: 0: 1104.6. Samples: 7593828. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:47:55,371][03180] Avg episode reward: [(0, '4776.361')] [2024-12-13 05:47:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014832_7593984.pth... 
[2024-12-13 05:47:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014768_7561216.pth [2024-12-13 05:48:00,371][03180] Fps is (10 sec: 1231.6, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7602176. Throughput: 0: 1103.2. Samples: 7601436. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:48:00,374][03180] Avg episode reward: [(0, '4741.825')] [2024-12-13 05:48:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7606272. Throughput: 0: 1119.8. Samples: 7607272. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:48:05,371][03180] Avg episode reward: [(0, '4691.787')] [2024-12-13 05:48:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7610368. Throughput: 0: 1098.9. Samples: 7610152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:48:10,371][03180] Avg episode reward: [(0, '4684.682')] [2024-12-13 05:48:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014864_7610368.pth... [2024-12-13 05:48:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014800_7577600.pth [2024-12-13 05:48:14,792][03226] Updated weights for policy 0, policy_version 14880 (0.0019) [2024-12-13 05:48:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7618560. Throughput: 0: 1096.7. Samples: 7617752. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:48:15,371][03180] Avg episode reward: [(0, '4714.599')] [2024-12-13 05:48:20,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 7622656. Throughput: 0: 1118.8. Samples: 7624044. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:48:20,375][03180] Avg episode reward: [(0, '4640.279')] [2024-12-13 05:48:25,373][03180] Fps is (10 sec: 819.0, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 7626752. Throughput: 0: 1094.7. Samples: 7626664. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:48:25,374][03180] Avg episode reward: [(0, '4658.227')] [2024-12-13 05:48:25,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014896_7626752.pth... [2024-12-13 05:48:25,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014832_7593984.pth [2024-12-13 05:48:30,371][03180] Fps is (10 sec: 819.3, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 7630848. Throughput: 0: 1049.3. Samples: 7632264. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:48:30,371][03180] Avg episode reward: [(0, '4643.766')] [2024-12-13 05:48:35,371][03180] Fps is (10 sec: 819.4, 60 sec: 1024.0, 300 sec: 1083.0). Total num frames: 7634944. Throughput: 0: 1062.6. Samples: 7638388. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 05:48:35,371][03180] Avg episode reward: [(0, '4727.489')] [2024-12-13 05:48:40,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1096.9). Total num frames: 7643136. Throughput: 0: 1043.9. Samples: 7640808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:48:40,374][03180] Avg episode reward: [(0, '4636.450')] [2024-12-13 05:48:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014928_7643136.pth... [2024-12-13 05:48:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014864_7610368.pth [2024-12-13 05:48:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 7647232. Throughput: 0: 1037.6. Samples: 7648128. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:48:45,371][03180] Avg episode reward: [(0, '4741.277')] [2024-12-13 05:48:50,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.7, 300 sec: 1096.9). Total num frames: 7655424. Throughput: 0: 1067.8. Samples: 7655324. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:48:50,371][03180] Avg episode reward: [(0, '4787.210')] [2024-12-13 05:48:54,805][03226] Updated weights for policy 0, policy_version 14960 (0.0013) [2024-12-13 05:48:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7659520. Throughput: 0: 1058.6. Samples: 7657788. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:48:55,371][03180] Avg episode reward: [(0, '4783.524')] [2024-12-13 05:48:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014960_7659520.pth... [2024-12-13 05:48:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014896_7626752.pth [2024-12-13 05:49:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 7663616. Throughput: 0: 1049.5. Samples: 7664980. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:49:00,371][03180] Avg episode reward: [(0, '4788.256')] [2024-12-13 05:49:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7671808. Throughput: 0: 1068.2. Samples: 7672112. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:49:05,371][03180] Avg episode reward: [(0, '4751.438')] [2024-12-13 05:49:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7675904. Throughput: 0: 1068.1. Samples: 7674724. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:49:10,371][03180] Avg episode reward: [(0, '4786.008')] [2024-12-13 05:49:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000014992_7675904.pth... [2024-12-13 05:49:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014928_7643136.pth [2024-12-13 05:49:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 7680000. Throughput: 0: 1095.2. Samples: 7681548. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:49:15,371][03180] Avg episode reward: [(0, '4767.572')] [2024-12-13 05:49:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7688192. Throughput: 0: 1125.2. Samples: 7689024. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:49:20,371][03180] Avg episode reward: [(0, '4835.877')] [2024-12-13 05:49:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7692288. Throughput: 0: 1130.7. Samples: 7691688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:49:25,373][03180] Avg episode reward: [(0, '4797.923')] [2024-12-13 05:49:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015024_7692288.pth... [2024-12-13 05:49:25,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014960_7659520.pth [2024-12-13 05:49:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7696384. Throughput: 0: 1113.5. Samples: 7698236. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:49:30,371][03180] Avg episode reward: [(0, '4839.864')] [2024-12-13 05:49:30,744][03226] Updated weights for policy 0, policy_version 15040 (0.0009) [2024-12-13 05:49:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7704576. Throughput: 0: 1122.8. Samples: 7705848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:49:35,371][03180] Avg episode reward: [(0, '4831.597')] [2024-12-13 05:49:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7708672. Throughput: 0: 1130.8. Samples: 7708672. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:49:40,371][03180] Avg episode reward: [(0, '4852.818')] [2024-12-13 05:49:40,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015056_7708672.pth... 
[2024-12-13 05:49:40,405][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000014992_7675904.pth [2024-12-13 05:49:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7712768. Throughput: 0: 1109.3. Samples: 7714900. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:49:45,371][03180] Avg episode reward: [(0, '4816.041')] [2024-12-13 05:49:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7720960. Throughput: 0: 1124.5. Samples: 7722716. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:49:50,371][03180] Avg episode reward: [(0, '4841.815')] [2024-12-13 05:49:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7725056. Throughput: 0: 1136.4. Samples: 7725864. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:49:55,371][03180] Avg episode reward: [(0, '4782.647')] [2024-12-13 05:49:55,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015088_7725056.pth... [2024-12-13 05:49:55,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015024_7692288.pth [2024-12-13 05:50:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7733248. Throughput: 0: 1121.5. Samples: 7732016. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:50:00,372][03180] Avg episode reward: [(0, '4826.646')] [2024-12-13 05:50:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7737344. Throughput: 0: 1127.5. Samples: 7739760. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:50:05,371][03180] Avg episode reward: [(0, '4880.643')] [2024-12-13 05:50:06,411][03226] Updated weights for policy 0, policy_version 15120 (0.0009) [2024-12-13 05:50:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7741440. Throughput: 0: 1143.4. Samples: 7743140. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:50:10,374][03180] Avg episode reward: [(0, '4837.708')] [2024-12-13 05:50:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015120_7741440.pth... [2024-12-13 05:50:10,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015056_7708672.pth [2024-12-13 05:50:15,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7749632. Throughput: 0: 1124.8. Samples: 7748856. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:50:15,374][03180] Avg episode reward: [(0, '4862.695')] [2024-12-13 05:50:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7753728. Throughput: 0: 1129.2. Samples: 7756660. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:50:20,371][03180] Avg episode reward: [(0, '4871.033')] [2024-12-13 05:50:25,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7761920. Throughput: 0: 1151.1. Samples: 7760472. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:50:25,371][03180] Avg episode reward: [(0, '4861.184')] [2024-12-13 05:50:25,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015160_7761920.pth... [2024-12-13 05:50:25,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015088_7725056.pth [2024-12-13 05:50:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7766016. Throughput: 0: 1131.7. Samples: 7765828. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:50:30,372][03180] Avg episode reward: [(0, '4845.136')] [2024-12-13 05:50:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7770112. Throughput: 0: 1129.1. Samples: 7773524. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:50:35,371][03180] Avg episode reward: [(0, '4901.430')] [2024-12-13 05:50:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7778304. Throughput: 0: 1142.0. Samples: 7777252. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:50:40,371][03180] Avg episode reward: [(0, '4848.686')] [2024-12-13 05:50:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015192_7778304.pth... [2024-12-13 05:50:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015120_7741440.pth [2024-12-13 05:50:43,922][03226] Updated weights for policy 0, policy_version 15200 (0.0012) [2024-12-13 05:50:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7782400. Throughput: 0: 1121.9. Samples: 7782500. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:50:45,371][03180] Avg episode reward: [(0, '4878.006')] [2024-12-13 05:50:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7786496. Throughput: 0: 1121.9. Samples: 7790244. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:50:50,371][03180] Avg episode reward: [(0, '4907.742')] [2024-12-13 05:50:55,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7794688. Throughput: 0: 1132.2. Samples: 7794092. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:50:55,374][03180] Avg episode reward: [(0, '4955.689')] [2024-12-13 05:50:55,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015224_7794688.pth... [2024-12-13 05:50:55,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015160_7761920.pth [2024-12-13 05:51:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7798784. Throughput: 0: 1129.5. Samples: 7799680. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:51:00,371][03180] Avg episode reward: [(0, '4988.801')] [2024-12-13 05:51:05,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7806976. Throughput: 0: 1119.6. Samples: 7807040. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:51:05,371][03180] Avg episode reward: [(0, '5035.426')] [2024-12-13 05:51:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7811072. Throughput: 0: 1122.2. Samples: 7810972. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:51:10,371][03180] Avg episode reward: [(0, '5054.475')] [2024-12-13 05:51:10,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015256_7811072.pth... [2024-12-13 05:51:10,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015192_7778304.pth [2024-12-13 05:51:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7815168. Throughput: 0: 1133.9. Samples: 7816852. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:51:15,371][03180] Avg episode reward: [(0, '5082.183')] [2024-12-13 05:51:19,575][03226] Updated weights for policy 0, policy_version 15280 (0.0009) [2024-12-13 05:51:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7823360. Throughput: 0: 1120.4. Samples: 7823944. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:51:20,371][03180] Avg episode reward: [(0, '5043.038')] [2024-12-13 05:51:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7827456. Throughput: 0: 1123.7. Samples: 7827820. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:51:25,372][03180] Avg episode reward: [(0, '5041.779')] [2024-12-13 05:51:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015288_7827456.pth... 
[2024-12-13 05:51:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015224_7794688.pth [2024-12-13 05:51:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7831552. Throughput: 0: 1149.4. Samples: 7834224. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:51:30,371][03180] Avg episode reward: [(0, '5059.570')] [2024-12-13 05:51:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7839744. Throughput: 0: 1124.9. Samples: 7840864. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:51:35,371][03180] Avg episode reward: [(0, '5048.743')] [2024-12-13 05:51:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7843840. Throughput: 0: 1126.0. Samples: 7844760. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:51:40,372][03180] Avg episode reward: [(0, '5068.789')] [2024-12-13 05:51:40,435][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015328_7847936.pth... [2024-12-13 05:51:40,438][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015256_7811072.pth [2024-12-13 05:51:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7852032. Throughput: 0: 1151.6. Samples: 7851500. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:51:45,371][03180] Avg episode reward: [(0, '5076.437')] [2024-12-13 05:51:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7856128. Throughput: 0: 1129.1. Samples: 7857848. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:51:50,371][03180] Avg episode reward: [(0, '5071.480')] [2024-12-13 05:51:54,983][03226] Updated weights for policy 0, policy_version 15360 (0.0009) [2024-12-13 05:51:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 7864320. Throughput: 0: 1128.4. Samples: 7861752. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:51:55,371][03180] Avg episode reward: [(0, '4978.185')] [2024-12-13 05:51:55,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015360_7864320.pth... [2024-12-13 05:51:55,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015288_7827456.pth [2024-12-13 05:52:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7868416. Throughput: 0: 1153.4. Samples: 7868756. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:52:00,371][03180] Avg episode reward: [(0, '5028.967')] [2024-12-13 05:52:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7872512. Throughput: 0: 1128.9. Samples: 7874744. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:52:05,371][03180] Avg episode reward: [(0, '5015.569')] [2024-12-13 05:52:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7880704. Throughput: 0: 1129.2. Samples: 7878632. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:52:10,371][03180] Avg episode reward: [(0, '5010.868')] [2024-12-13 05:52:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015392_7880704.pth... [2024-12-13 05:52:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015328_7847936.pth [2024-12-13 05:52:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7884800. Throughput: 0: 1148.1. Samples: 7885888. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:52:15,371][03180] Avg episode reward: [(0, '4977.930')] [2024-12-13 05:52:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7888896. Throughput: 0: 1128.1. Samples: 7891628. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:52:20,371][03180] Avg episode reward: [(0, '4962.587')] [2024-12-13 05:52:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7897088. Throughput: 0: 1128.2. Samples: 7895528. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:52:25,375][03180] Avg episode reward: [(0, '4937.787')] [2024-12-13 05:52:25,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015424_7897088.pth... [2024-12-13 05:52:25,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015360_7864320.pth [2024-12-13 05:52:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 7901184. Throughput: 0: 1148.1. Samples: 7903164. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:52:30,371][03180] Avg episode reward: [(0, '5053.948')] [2024-12-13 05:52:31,783][03226] Updated weights for policy 0, policy_version 15440 (0.0009) [2024-12-13 05:52:35,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 7905280. Throughput: 0: 1105.2. Samples: 7907584. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:52:35,373][03180] Avg episode reward: [(0, '5096.839')] [2024-12-13 05:52:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 7909376. Throughput: 0: 1068.8. Samples: 7909848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:52:40,371][03180] Avg episode reward: [(0, '5176.672')] [2024-12-13 05:52:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015448_7909376.pth... [2024-12-13 05:52:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015392_7880704.pth [2024-12-13 05:52:45,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1110.9). Total num frames: 7917568. Throughput: 0: 1084.3. Samples: 7917548. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:52:45,371][03180] Avg episode reward: [(0, '5083.394')] [2024-12-13 05:52:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7921664. Throughput: 0: 1089.2. Samples: 7923760. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:52:50,373][03180] Avg episode reward: [(0, '5045.735')] [2024-12-13 05:52:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 7925760. Throughput: 0: 1067.8. Samples: 7926684. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:52:55,371][03180] Avg episode reward: [(0, '4999.302')] [2024-12-13 05:52:55,428][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015488_7929856.pth... [2024-12-13 05:52:55,438][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015424_7897088.pth [2024-12-13 05:53:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7933952. Throughput: 0: 1081.2. Samples: 7934544. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:53:00,371][03180] Avg episode reward: [(0, '4996.292')] [2024-12-13 05:53:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7938048. Throughput: 0: 1101.6. Samples: 7941200. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:53:05,371][03180] Avg episode reward: [(0, '5001.430')] [2024-12-13 05:53:10,104][03226] Updated weights for policy 0, policy_version 15520 (0.0013) [2024-12-13 05:53:10,372][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 7946240. Throughput: 0: 1072.2. Samples: 7943780. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:53:10,372][03180] Avg episode reward: [(0, '4999.147')] [2024-12-13 05:53:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015520_7946240.pth... 
[2024-12-13 05:53:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015448_7909376.pth [2024-12-13 05:53:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7950336. Throughput: 0: 1069.9. Samples: 7951308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:53:15,371][03180] Avg episode reward: [(0, '4978.498')] [2024-12-13 05:53:20,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7958528. Throughput: 0: 1128.9. Samples: 7958384. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:53:20,371][03180] Avg episode reward: [(0, '5096.420')] [2024-12-13 05:53:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7962624. Throughput: 0: 1132.9. Samples: 7960828. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:53:25,371][03180] Avg episode reward: [(0, '5090.077')] [2024-12-13 05:53:25,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015552_7962624.pth... [2024-12-13 05:53:25,380][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015488_7929856.pth [2024-12-13 05:53:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 7966720. Throughput: 0: 1122.5. Samples: 7968060. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:53:30,372][03180] Avg episode reward: [(0, '4993.312')] [2024-12-13 05:53:35,375][03180] Fps is (10 sec: 1228.3, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7974912. Throughput: 0: 1144.8. Samples: 7975280. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:53:35,376][03180] Avg episode reward: [(0, '5062.303')] [2024-12-13 05:53:40,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7979008. Throughput: 0: 1137.8. Samples: 7977888. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:53:40,373][03180] Avg episode reward: [(0, '5100.107')] [2024-12-13 05:53:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015584_7979008.pth... [2024-12-13 05:53:40,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015520_7946240.pth [2024-12-13 05:53:45,371][03180] Fps is (10 sec: 819.6, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7983104. Throughput: 0: 1118.9. Samples: 7984896. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:53:45,371][03180] Avg episode reward: [(0, '5146.915')] [2024-12-13 05:53:45,863][03226] Updated weights for policy 0, policy_version 15600 (0.0009) [2024-12-13 05:53:50,374][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7991296. Throughput: 0: 1140.1. Samples: 7992508. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:53:50,375][03180] Avg episode reward: [(0, '5032.798')] [2024-12-13 05:53:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 7995392. Throughput: 0: 1141.5. Samples: 7995148. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:53:55,371][03180] Avg episode reward: [(0, '4967.644')] [2024-12-13 05:53:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015616_7995392.pth... [2024-12-13 05:53:55,394][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015552_7962624.pth [2024-12-13 05:54:00,371][03180] Fps is (10 sec: 819.5, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7999488. Throughput: 0: 1120.3. Samples: 8001720. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:54:00,371][03180] Avg episode reward: [(0, '4902.692')] [2024-12-13 05:54:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8007680. Throughput: 0: 1135.1. Samples: 8009464. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 05:54:05,371][03180] Avg episode reward: [(0, '4963.521')] [2024-12-13 05:54:10,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 8011776. Throughput: 0: 1140.2. Samples: 8012140. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:54:10,373][03180] Avg episode reward: [(0, '4967.641')] [2024-12-13 05:54:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015648_8011776.pth... [2024-12-13 05:54:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015584_7979008.pth [2024-12-13 05:54:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8019968. Throughput: 0: 1123.1. Samples: 8018600. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:54:15,371][03180] Avg episode reward: [(0, '4963.659')] [2024-12-13 05:54:20,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8024064. Throughput: 0: 1135.0. Samples: 8026352. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:54:20,371][03180] Avg episode reward: [(0, '4957.915')] [2024-12-13 05:54:21,635][03226] Updated weights for policy 0, policy_version 15680 (0.0011) [2024-12-13 05:54:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8028160. Throughput: 0: 1144.8. Samples: 8029400. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:54:25,371][03180] Avg episode reward: [(0, '4959.728')] [2024-12-13 05:54:25,391][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015680_8028160.pth... [2024-12-13 05:54:25,406][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015616_7995392.pth [2024-12-13 05:54:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8036352. Throughput: 0: 1124.9. Samples: 8035516. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:54:30,371][03180] Avg episode reward: [(0, '4929.099')] [2024-12-13 05:54:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8040448. Throughput: 0: 1127.0. Samples: 8043220. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:54:35,371][03180] Avg episode reward: [(0, '4937.978')] [2024-12-13 05:54:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8044544. Throughput: 0: 1139.7. Samples: 8046436. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:54:40,371][03180] Avg episode reward: [(0, '4874.593')] [2024-12-13 05:54:40,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015712_8044544.pth... [2024-12-13 05:54:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015648_8011776.pth [2024-12-13 05:54:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8052736. Throughput: 0: 1122.3. Samples: 8052224. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:54:45,371][03180] Avg episode reward: [(0, '4872.681')] [2024-12-13 05:54:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8056832. Throughput: 0: 1125.0. Samples: 8060088. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:54:50,371][03180] Avg episode reward: [(0, '4874.237')] [2024-12-13 05:54:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8065024. Throughput: 0: 1143.1. Samples: 8063576. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:54:55,371][03180] Avg episode reward: [(0, '4961.008')] [2024-12-13 05:54:55,387][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015752_8065024.pth... 
[2024-12-13 05:54:55,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015680_8028160.pth [2024-12-13 05:54:59,078][03226] Updated weights for policy 0, policy_version 15760 (0.0011) [2024-12-13 05:55:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8069120. Throughput: 0: 1122.7. Samples: 8069120. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:55:00,371][03180] Avg episode reward: [(0, '4954.196')] [2024-12-13 05:55:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8073216. Throughput: 0: 1123.6. Samples: 8076912. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:55:05,371][03180] Avg episode reward: [(0, '4946.660')] [2024-12-13 05:55:10,376][03180] Fps is (10 sec: 1228.1, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8081408. Throughput: 0: 1141.2. Samples: 8080760. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:55:10,377][03180] Avg episode reward: [(0, '4949.730')] [2024-12-13 05:55:10,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015784_8081408.pth... [2024-12-13 05:55:10,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015712_8044544.pth [2024-12-13 05:55:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8085504. Throughput: 0: 1122.2. Samples: 8086016. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:55:15,371][03180] Avg episode reward: [(0, '5003.852')] [2024-12-13 05:55:20,371][03180] Fps is (10 sec: 1229.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8093696. Throughput: 0: 1121.7. Samples: 8093696. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:55:20,371][03180] Avg episode reward: [(0, '5101.974')] [2024-12-13 05:55:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8097792. Throughput: 0: 1137.1. Samples: 8097604. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:55:25,379][03180] Avg episode reward: [(0, '5104.803')] [2024-12-13 05:55:25,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015816_8097792.pth... [2024-12-13 05:55:25,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015752_8065024.pth [2024-12-13 05:55:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8101888. Throughput: 0: 1126.2. Samples: 8102904. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:55:30,371][03180] Avg episode reward: [(0, '5103.074')] [2024-12-13 05:55:34,828][03226] Updated weights for policy 0, policy_version 15840 (0.0009) [2024-12-13 05:55:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8110080. Throughput: 0: 1116.4. Samples: 8110324. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:55:35,371][03180] Avg episode reward: [(0, '5150.582')] [2024-12-13 05:55:40,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8114176. Throughput: 0: 1124.4. Samples: 8114176. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:55:40,374][03180] Avg episode reward: [(0, '5149.544')] [2024-12-13 05:55:40,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015848_8114176.pth... [2024-12-13 05:55:40,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015784_8081408.pth [2024-12-13 05:55:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8118272. Throughput: 0: 1130.0. Samples: 8119972. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:55:45,371][03180] Avg episode reward: [(0, '5150.470')] [2024-12-13 05:55:50,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8126464. Throughput: 0: 1114.8. Samples: 8127076. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:55:50,371][03180] Avg episode reward: [(0, '5140.068')] [2024-12-13 05:55:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8130560. Throughput: 0: 1115.0. Samples: 8130928. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:55:55,371][03180] Avg episode reward: [(0, '5146.494')] [2024-12-13 05:55:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015880_8130560.pth... [2024-12-13 05:55:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015816_8097792.pth [2024-12-13 05:56:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8134656. Throughput: 0: 1130.0. Samples: 8136864. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:56:00,371][03180] Avg episode reward: [(0, '5209.576')] [2024-12-13 05:56:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8142848. Throughput: 0: 1109.4. Samples: 8143620. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:56:05,371][03180] Avg episode reward: [(0, '5212.080')] [2024-12-13 05:56:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1124.7). Total num frames: 8146944. Throughput: 0: 1107.4. Samples: 8147436. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:56:10,372][03180] Avg episode reward: [(0, '5290.788')] [2024-12-13 05:56:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015912_8146944.pth... [2024-12-13 05:56:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015848_8114176.pth [2024-12-13 05:56:11,091][03226] Updated weights for policy 0, policy_version 15920 (0.0020) [2024-12-13 05:56:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8151040. Throughput: 0: 1126.8. Samples: 8153608. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:56:15,373][03180] Avg episode reward: [(0, '5265.022')] [2024-12-13 05:56:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8159232. Throughput: 0: 1107.7. Samples: 8160172. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 05:56:20,371][03180] Avg episode reward: [(0, '5264.473')] [2024-12-13 05:56:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8163328. Throughput: 0: 1108.8. Samples: 8164068. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:56:25,371][03180] Avg episode reward: [(0, '5208.840')] [2024-12-13 05:56:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015944_8163328.pth... [2024-12-13 05:56:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015880_8130560.pth [2024-12-13 05:56:30,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8171520. Throughput: 0: 1124.1. Samples: 8170560. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:56:30,374][03180] Avg episode reward: [(0, '5159.477')] [2024-12-13 05:56:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8175616. Throughput: 0: 1107.9. Samples: 8176932. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:56:35,371][03180] Avg episode reward: [(0, '5093.352')] [2024-12-13 05:56:40,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8179712. Throughput: 0: 1108.4. Samples: 8180804. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:56:40,371][03180] Avg episode reward: [(0, '5152.312')] [2024-12-13 05:56:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000015976_8179712.pth... 
[2024-12-13 05:56:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015912_8146944.pth [2024-12-13 05:56:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8183808. Throughput: 0: 1096.6. Samples: 8186212. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:56:45,375][03180] Avg episode reward: [(0, '5147.604')] [2024-12-13 05:56:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 8187904. Throughput: 0: 1053.1. Samples: 8191008. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:56:50,371][03180] Avg episode reward: [(0, '5215.714')] [2024-12-13 05:56:50,905][03226] Updated weights for policy 0, policy_version 16000 (0.0009) [2024-12-13 05:56:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8196096. Throughput: 0: 1051.6. Samples: 8194756. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:56:55,371][03180] Avg episode reward: [(0, '5203.871')] [2024-12-13 05:56:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016008_8196096.pth... [2024-12-13 05:56:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015944_8163328.pth [2024-12-13 05:57:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8200192. Throughput: 0: 1083.4. Samples: 8202360. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:57:00,371][03180] Avg episode reward: [(0, '5165.816')] [2024-12-13 05:57:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 8204288. Throughput: 0: 1054.0. Samples: 8207604. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:57:05,371][03180] Avg episode reward: [(0, '5170.684')] [2024-12-13 05:57:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8212480. Throughput: 0: 1050.5. Samples: 8211340. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:57:10,371][03180] Avg episode reward: [(0, '5234.756')] [2024-12-13 05:57:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016040_8212480.pth... [2024-12-13 05:57:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000015976_8179712.pth [2024-12-13 05:57:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8216576. Throughput: 0: 1077.9. Samples: 8219064. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:57:15,371][03180] Avg episode reward: [(0, '5184.752')] [2024-12-13 05:57:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8224768. Throughput: 0: 1060.8. Samples: 8224668. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:57:20,372][03180] Avg episode reward: [(0, '5201.137')] [2024-12-13 05:57:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8228864. Throughput: 0: 1054.9. Samples: 8228276. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:57:25,371][03180] Avg episode reward: [(0, '5210.704')] [2024-12-13 05:57:25,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016072_8228864.pth... [2024-12-13 05:57:25,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016008_8196096.pth [2024-12-13 05:57:26,846][03226] Updated weights for policy 0, policy_version 16080 (0.0012) [2024-12-13 05:57:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8237056. Throughput: 0: 1101.8. Samples: 8235792. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:57:30,371][03180] Avg episode reward: [(0, '5163.336')] [2024-12-13 05:57:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8241152. Throughput: 0: 1123.5. Samples: 8241564. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:57:35,371][03180] Avg episode reward: [(0, '5148.194')] [2024-12-13 05:57:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8245248. Throughput: 0: 1113.9. Samples: 8244880. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 05:57:40,371][03180] Avg episode reward: [(0, '5124.662')] [2024-12-13 05:57:40,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016104_8245248.pth... [2024-12-13 05:57:40,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016040_8212480.pth [2024-12-13 05:57:45,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8253440. Throughput: 0: 1117.6. Samples: 8252656. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:57:45,374][03180] Avg episode reward: [(0, '5083.183')] [2024-12-13 05:57:50,374][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8257536. Throughput: 0: 1140.2. Samples: 8258916. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:57:50,374][03180] Avg episode reward: [(0, '5106.080')] [2024-12-13 05:57:55,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8261632. Throughput: 0: 1119.8. Samples: 8261732. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 05:57:55,371][03180] Avg episode reward: [(0, '5143.104')] [2024-12-13 05:57:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016136_8261632.pth... [2024-12-13 05:57:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016072_8228864.pth [2024-12-13 05:58:00,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8269824. Throughput: 0: 1124.1. Samples: 8269648. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:58:00,372][03180] Avg episode reward: [(0, '5142.134')] [2024-12-13 05:58:02,566][03226] Updated weights for policy 0, policy_version 16160 (0.0012) [2024-12-13 05:58:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 8273920. Throughput: 0: 1142.8. Samples: 8276092. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:58:05,371][03180] Avg episode reward: [(0, '5104.955')] [2024-12-13 05:58:10,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8278016. Throughput: 0: 1117.3. Samples: 8278556. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:58:10,371][03180] Avg episode reward: [(0, '5085.543')] [2024-12-13 05:58:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016168_8278016.pth... [2024-12-13 05:58:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016104_8245248.pth [2024-12-13 05:58:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 8286208. Throughput: 0: 1120.8. Samples: 8286228. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:58:15,371][03180] Avg episode reward: [(0, '5122.370')] [2024-12-13 05:58:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8290304. Throughput: 0: 1147.4. Samples: 8293196. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:58:20,378][03180] Avg episode reward: [(0, '5167.485')] [2024-12-13 05:58:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8294400. Throughput: 0: 1127.0. Samples: 8295596. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 05:58:25,371][03180] Avg episode reward: [(0, '5167.485')] [2024-12-13 05:58:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016200_8294400.pth... 
[2024-12-13 05:58:25,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016136_8261632.pth [2024-12-13 05:58:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8302592. Throughput: 0: 1114.5. Samples: 8302808. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:58:30,372][03180] Avg episode reward: [(0, '4964.246')] [2024-12-13 05:58:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8306688. Throughput: 0: 1138.7. Samples: 8310156. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 05:58:35,371][03180] Avg episode reward: [(0, '5008.981')] [2024-12-13 05:58:40,365][03226] Updated weights for policy 0, policy_version 16240 (0.0017) [2024-12-13 05:58:40,375][03180] Fps is (10 sec: 1228.3, 60 sec: 1160.5, 300 sec: 1124.6). Total num frames: 8314880. Throughput: 0: 1126.6. Samples: 8312432. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:58:40,376][03180] Avg episode reward: [(0, '5011.199')] [2024-12-13 05:58:40,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016240_8314880.pth... [2024-12-13 05:58:40,390][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016168_8278016.pth [2024-12-13 05:58:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8318976. Throughput: 0: 1110.2. Samples: 8319608. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:58:45,371][03180] Avg episode reward: [(0, '4979.580')] [2024-12-13 05:58:50,371][03180] Fps is (10 sec: 1229.3, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 8327168. Throughput: 0: 1134.3. Samples: 8327136. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:58:50,371][03180] Avg episode reward: [(0, '5002.301')] [2024-12-13 05:58:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8331264. Throughput: 0: 1137.4. Samples: 8329740. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:58:55,371][03180] Avg episode reward: [(0, '4985.070')] [2024-12-13 05:58:55,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016272_8331264.pth... [2024-12-13 05:58:55,380][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016200_8294400.pth [2024-12-13 05:59:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8335360. Throughput: 0: 1118.0. Samples: 8336536. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:59:00,371][03180] Avg episode reward: [(0, '4947.174')] [2024-12-13 05:59:05,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8343552. Throughput: 0: 1135.7. Samples: 8344304. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:59:05,373][03180] Avg episode reward: [(0, '5041.995')] [2024-12-13 05:59:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 8347648. Throughput: 0: 1141.8. Samples: 8346976. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:59:10,371][03180] Avg episode reward: [(0, '5054.676')] [2024-12-13 05:59:10,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016304_8347648.pth... [2024-12-13 05:59:10,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016240_8314880.pth [2024-12-13 05:59:15,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8351744. Throughput: 0: 1125.2. Samples: 8353440. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:59:15,372][03180] Avg episode reward: [(0, '5031.166')] [2024-12-13 05:59:15,927][03226] Updated weights for policy 0, policy_version 16320 (0.0008) [2024-12-13 05:59:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8359936. Throughput: 0: 1134.3. Samples: 8361200. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:59:20,371][03180] Avg episode reward: [(0, '4997.074')] [2024-12-13 05:59:25,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 8364032. Throughput: 0: 1146.7. Samples: 8364032. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:59:25,373][03180] Avg episode reward: [(0, '4983.840')] [2024-12-13 05:59:25,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016336_8364032.pth... [2024-12-13 05:59:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016272_8331264.pth [2024-12-13 05:59:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8368128. Throughput: 0: 1123.6. Samples: 8370172. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:59:30,371][03180] Avg episode reward: [(0, '5007.407')] [2024-12-13 05:59:35,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8376320. Throughput: 0: 1129.4. Samples: 8377960. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:59:35,371][03180] Avg episode reward: [(0, '5018.602')] [2024-12-13 05:59:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8380416. Throughput: 0: 1138.9. Samples: 8380992. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:59:40,378][03180] Avg episode reward: [(0, '5005.043')] [2024-12-13 05:59:40,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016368_8380416.pth... [2024-12-13 05:59:40,402][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016304_8347648.pth [2024-12-13 05:59:45,373][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8388608. Throughput: 0: 1122.4. Samples: 8387048. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:59:45,373][03180] Avg episode reward: [(0, '5060.872')] [2024-12-13 05:59:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8392704. Throughput: 0: 1121.4. Samples: 8394764. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:59:50,371][03180] Avg episode reward: [(0, '5124.619')] [2024-12-13 05:59:51,673][03226] Updated weights for policy 0, policy_version 16400 (0.0013) [2024-12-13 05:59:55,373][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 8396800. Throughput: 0: 1136.6. Samples: 8398128. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 05:59:55,374][03180] Avg episode reward: [(0, '5116.338')] [2024-12-13 05:59:55,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016400_8396800.pth... [2024-12-13 05:59:55,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016336_8364032.pth [2024-12-13 06:00:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8404992. Throughput: 0: 1118.6. Samples: 8403776. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:00:00,371][03180] Avg episode reward: [(0, '5083.310')] [2024-12-13 06:00:05,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8409088. Throughput: 0: 1111.2. Samples: 8411204. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 06:00:05,371][03180] Avg episode reward: [(0, '5103.233')] [2024-12-13 06:00:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8413184. Throughput: 0: 1126.7. Samples: 8414732. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 06:00:10,371][03180] Avg episode reward: [(0, '5096.563')] [2024-12-13 06:00:10,384][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016432_8413184.pth... 
[2024-12-13 06:00:10,396][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016368_8380416.pth [2024-12-13 06:00:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 8421376. Throughput: 0: 1107.1. Samples: 8419992. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:00:15,371][03180] Avg episode reward: [(0, '5141.313')] [2024-12-13 06:00:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8425472. Throughput: 0: 1106.9. Samples: 8427772. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:00:20,371][03180] Avg episode reward: [(0, '5061.559')] [2024-12-13 06:00:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8429568. Throughput: 0: 1125.2. Samples: 8431628. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:00:25,371][03180] Avg episode reward: [(0, '5075.852')] [2024-12-13 06:00:25,424][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016472_8433664.pth... [2024-12-13 06:00:25,434][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016400_8396800.pth [2024-12-13 06:00:29,840][03226] Updated weights for policy 0, policy_version 16480 (0.0010) [2024-12-13 06:00:30,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 8437760. Throughput: 0: 1105.8. Samples: 8436808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:00:30,373][03180] Avg episode reward: [(0, '5013.196')] [2024-12-13 06:00:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8441856. Throughput: 0: 1104.2. Samples: 8444452. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:00:35,371][03180] Avg episode reward: [(0, '5026.681')] [2024-12-13 06:00:40,376][03180] Fps is (10 sec: 1228.4, 60 sec: 1160.4, 300 sec: 1124.6). Total num frames: 8450048. Throughput: 0: 1114.1. Samples: 8448268. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:00:40,377][03180] Avg episode reward: [(0, '5077.730')] [2024-12-13 06:00:40,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016504_8450048.pth... [2024-12-13 06:00:40,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016432_8413184.pth [2024-12-13 06:00:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8454144. Throughput: 0: 1114.7. Samples: 8453936. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:00:45,371][03180] Avg episode reward: [(0, '5078.515')] [2024-12-13 06:00:50,371][03180] Fps is (10 sec: 819.7, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8458240. Throughput: 0: 1116.6. Samples: 8461452. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:00:50,371][03180] Avg episode reward: [(0, '5074.403')] [2024-12-13 06:00:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8462336. Throughput: 0: 1103.4. Samples: 8464384. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:00:55,371][03180] Avg episode reward: [(0, '5018.795')] [2024-12-13 06:00:55,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016528_8462336.pth... [2024-12-13 06:00:55,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016472_8433664.pth [2024-12-13 06:01:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 8466432. Throughput: 0: 1084.3. Samples: 8468784. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:01:00,380][03180] Avg episode reward: [(0, '5002.655')] [2024-12-13 06:01:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8474624. Throughput: 0: 1058.1. Samples: 8475388. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:01:05,371][03180] Avg episode reward: [(0, '5033.313')] [2024-12-13 06:01:07,682][03226] Updated weights for policy 0, policy_version 16560 (0.0011) [2024-12-13 06:01:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8478720. Throughput: 0: 1056.7. Samples: 8479180. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:01:10,371][03180] Avg episode reward: [(0, '4980.754')] [2024-12-13 06:01:10,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016560_8478720.pth... [2024-12-13 06:01:10,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016504_8450048.pth [2024-12-13 06:01:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1096.9). Total num frames: 8482816. Throughput: 0: 1089.4. Samples: 8485828. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:01:15,371][03180] Avg episode reward: [(0, '4973.266')] [2024-12-13 06:01:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8491008. Throughput: 0: 1060.3. Samples: 8492164. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:01:20,371][03180] Avg episode reward: [(0, '5007.570')] [2024-12-13 06:01:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1096.9). Total num frames: 8495104. Throughput: 0: 1059.8. Samples: 8495952. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:01:25,372][03180] Avg episode reward: [(0, '5062.024')] [2024-12-13 06:01:25,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016592_8495104.pth... [2024-12-13 06:01:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016528_8462336.pth [2024-12-13 06:01:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8503296. Throughput: 0: 1088.4. Samples: 8502916. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:01:30,371][03180] Avg episode reward: [(0, '5062.024')] [2024-12-13 06:01:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8507392. Throughput: 0: 1057.2. Samples: 8509028. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:01:35,371][03180] Avg episode reward: [(0, '5028.905')] [2024-12-13 06:01:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1124.7). Total num frames: 8515584. Throughput: 0: 1078.0. Samples: 8512896. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:01:40,371][03180] Avg episode reward: [(0, '5107.762')] [2024-12-13 06:01:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016632_8515584.pth... [2024-12-13 06:01:40,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016560_8478720.pth [2024-12-13 06:01:43,483][03226] Updated weights for policy 0, policy_version 16640 (0.0016) [2024-12-13 06:01:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8519680. Throughput: 0: 1141.6. Samples: 8520156. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:01:45,371][03180] Avg episode reward: [(0, '5111.490')] [2024-12-13 06:01:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8523776. Throughput: 0: 1128.4. Samples: 8526164. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:01:50,371][03180] Avg episode reward: [(0, '5144.451')] [2024-12-13 06:01:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8531968. Throughput: 0: 1131.3. Samples: 8530088. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:01:55,371][03180] Avg episode reward: [(0, '5176.871')] [2024-12-13 06:01:55,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016664_8531968.pth... 
[2024-12-13 06:01:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016592_8495104.pth [2024-12-13 06:02:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8536064. Throughput: 0: 1145.0. Samples: 8537352. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:02:00,371][03180] Avg episode reward: [(0, '5218.026')] [2024-12-13 06:02:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8540160. Throughput: 0: 1131.4. Samples: 8543076. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:02:05,371][03180] Avg episode reward: [(0, '5218.638')] [2024-12-13 06:02:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8548352. Throughput: 0: 1132.1. Samples: 8546896. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:02:10,372][03180] Avg episode reward: [(0, '5234.422')] [2024-12-13 06:02:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016696_8548352.pth... [2024-12-13 06:02:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016632_8515584.pth [2024-12-13 06:02:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 8552448. Throughput: 0: 1148.9. Samples: 8554616. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:02:15,374][03180] Avg episode reward: [(0, '5126.415')] [2024-12-13 06:02:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8556544. Throughput: 0: 1131.9. Samples: 8559964. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:02:20,371][03180] Avg episode reward: [(0, '5141.170')] [2024-12-13 06:02:20,669][03226] Updated weights for policy 0, policy_version 16720 (0.0009) [2024-12-13 06:02:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1110.8). Total num frames: 8564736. Throughput: 0: 1131.3. Samples: 8563804. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:02:25,371][03180] Avg episode reward: [(0, '5220.418')] [2024-12-13 06:02:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016728_8564736.pth... [2024-12-13 06:02:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016664_8531968.pth [2024-12-13 06:02:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8568832. Throughput: 0: 1143.6. Samples: 8571616. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:02:30,371][03180] Avg episode reward: [(0, '5251.786')] [2024-12-13 06:02:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8577024. Throughput: 0: 1126.8. Samples: 8576868. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:02:35,371][03180] Avg episode reward: [(0, '5208.650')] [2024-12-13 06:02:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8581120. Throughput: 0: 1124.4. Samples: 8580688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:02:40,371][03180] Avg episode reward: [(0, '5218.642')] [2024-12-13 06:02:40,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016760_8581120.pth... [2024-12-13 06:02:40,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016696_8548352.pth [2024-12-13 06:02:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8589312. Throughput: 0: 1137.1. Samples: 8588520. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:02:45,371][03180] Avg episode reward: [(0, '5218.727')] [2024-12-13 06:02:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8593408. Throughput: 0: 1129.3. Samples: 8593896. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:02:50,371][03180] Avg episode reward: [(0, '5180.886')] [2024-12-13 06:02:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8597504. Throughput: 0: 1125.9. Samples: 8597560. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:02:55,371][03180] Avg episode reward: [(0, '5042.175')] [2024-12-13 06:02:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016792_8597504.pth... [2024-12-13 06:02:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016728_8564736.pth [2024-12-13 06:02:56,160][03226] Updated weights for policy 0, policy_version 16800 (0.0010) [2024-12-13 06:03:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8605696. Throughput: 0: 1130.3. Samples: 8605480. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:03:00,371][03180] Avg episode reward: [(0, '5088.750')] [2024-12-13 06:03:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8609792. Throughput: 0: 1138.1. Samples: 8611180. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:03:05,371][03180] Avg episode reward: [(0, '5104.353')] [2024-12-13 06:03:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8613888. Throughput: 0: 1124.1. Samples: 8614388. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:03:10,371][03180] Avg episode reward: [(0, '5102.589')] [2024-12-13 06:03:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016824_8613888.pth... [2024-12-13 06:03:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016760_8581120.pth [2024-12-13 06:03:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8622080. Throughput: 0: 1121.4. Samples: 8622080. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:03:15,371][03180] Avg episode reward: [(0, '5092.869')] [2024-12-13 06:03:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8626176. Throughput: 0: 1139.4. Samples: 8628140. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:03:20,373][03180] Avg episode reward: [(0, '5092.084')] [2024-12-13 06:03:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8630272. Throughput: 0: 1124.3. Samples: 8631280. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:03:25,372][03180] Avg episode reward: [(0, '5096.971')] [2024-12-13 06:03:25,402][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016864_8634368.pth... [2024-12-13 06:03:25,418][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016792_8597504.pth [2024-12-13 06:03:30,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8638464. Throughput: 0: 1121.9. Samples: 8639008. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:03:30,373][03180] Avg episode reward: [(0, '5163.933')] [2024-12-13 06:03:31,958][03226] Updated weights for policy 0, policy_version 16880 (0.0009) [2024-12-13 06:03:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8642560. Throughput: 0: 1142.9. Samples: 8645328. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:03:35,374][03180] Avg episode reward: [(0, '5153.151')] [2024-12-13 06:03:40,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8650752. Throughput: 0: 1122.7. Samples: 8648080. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 06:03:40,371][03180] Avg episode reward: [(0, '5144.701')] [2024-12-13 06:03:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016896_8650752.pth... 
[2024-12-13 06:03:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016824_8613888.pth [2024-12-13 06:03:45,378][03180] Fps is (10 sec: 1227.9, 60 sec: 1092.1, 300 sec: 1110.8). Total num frames: 8654848. Throughput: 0: 1113.3. Samples: 8655588. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 06:03:45,379][03180] Avg episode reward: [(0, '5148.097')] [2024-12-13 06:03:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8658944. Throughput: 0: 1136.1. Samples: 8662304. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 06:03:50,371][03180] Avg episode reward: [(0, '5092.668')] [2024-12-13 06:03:55,371][03180] Fps is (10 sec: 1229.7, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8667136. Throughput: 0: 1118.0. Samples: 8664700. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:03:55,371][03180] Avg episode reward: [(0, '5114.322')] [2024-12-13 06:03:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016928_8667136.pth... [2024-12-13 06:03:55,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016864_8634368.pth [2024-12-13 06:04:00,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 8671232. Throughput: 0: 1116.2. Samples: 8672312. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:04:00,376][03180] Avg episode reward: [(0, '5110.785')] [2024-12-13 06:04:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8679424. Throughput: 0: 1136.4. Samples: 8679276. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:04:05,376][03180] Avg episode reward: [(0, '5130.966')] [2024-12-13 06:04:09,791][03226] Updated weights for policy 0, policy_version 16960 (0.0009) [2024-12-13 06:04:10,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8683520. Throughput: 0: 1121.8. Samples: 8681760. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:04:10,371][03180] Avg episode reward: [(0, '5135.898')] [2024-12-13 06:04:10,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016960_8683520.pth... [2024-12-13 06:04:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016896_8650752.pth [2024-12-13 06:04:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8687616. Throughput: 0: 1107.6. Samples: 8688848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:04:15,371][03180] Avg episode reward: [(0, '5239.779')] [2024-12-13 06:04:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8695808. Throughput: 0: 1126.6. Samples: 8696024. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:04:20,371][03180] Avg episode reward: [(0, '5292.035')] [2024-12-13 06:04:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8699904. Throughput: 0: 1124.5. Samples: 8698684. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:04:25,371][03180] Avg episode reward: [(0, '5347.677')] [2024-12-13 06:04:25,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000016992_8699904.pth... [2024-12-13 06:04:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016928_8667136.pth [2024-12-13 06:04:25,384][03213] Saving new best policy, reward=5347.677! [2024-12-13 06:04:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8704000. Throughput: 0: 1114.6. Samples: 8705736. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:04:30,371][03180] Avg episode reward: [(0, '5308.139')] [2024-12-13 06:04:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8712192. Throughput: 0: 1134.6. Samples: 8713360. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:04:35,371][03180] Avg episode reward: [(0, '5307.661')] [2024-12-13 06:04:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8716288. Throughput: 0: 1138.9. Samples: 8715952. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:04:40,371][03180] Avg episode reward: [(0, '5328.608')] [2024-12-13 06:04:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017024_8716288.pth... [2024-12-13 06:04:40,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016960_8683520.pth [2024-12-13 06:04:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.4, 300 sec: 1110.8). Total num frames: 8720384. Throughput: 0: 1120.5. Samples: 8722728. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:04:45,372][03180] Avg episode reward: [(0, '5340.725')] [2024-12-13 06:04:45,404][03226] Updated weights for policy 0, policy_version 17040 (0.0009) [2024-12-13 06:04:50,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1160.4, 300 sec: 1124.7). Total num frames: 8728576. Throughput: 0: 1138.5. Samples: 8730512. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:04:50,376][03180] Avg episode reward: [(0, '5350.113')] [2024-12-13 06:04:50,377][03213] Saving new best policy, reward=5350.113! [2024-12-13 06:04:55,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 8732672. Throughput: 0: 1137.7. Samples: 8732960. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:04:55,376][03180] Avg episode reward: [(0, '5350.329')] [2024-12-13 06:04:55,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017056_8732672.pth... [2024-12-13 06:04:55,404][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000016992_8699904.pth [2024-12-13 06:04:55,404][03213] Saving new best policy, reward=5350.329! [2024-12-13 06:05:00,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1160.6, 300 sec: 1124.7). 
Total num frames: 8740864. Throughput: 0: 1131.2. Samples: 8739752. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:05:00,371][03180] Avg episode reward: [(0, '5351.486')] [2024-12-13 06:05:00,375][03213] Saving new best policy, reward=5351.486! [2024-12-13 06:05:05,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8744960. Throughput: 0: 1097.4. Samples: 8745408. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:05:05,371][03180] Avg episode reward: [(0, '5390.535')] [2024-12-13 06:05:05,377][03213] Saving new best policy, reward=5390.535! [2024-12-13 06:05:10,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1110.8). Total num frames: 8749056. Throughput: 0: 1091.6. Samples: 8747808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:05:10,373][03180] Avg episode reward: [(0, '5468.157')] [2024-12-13 06:05:10,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017088_8749056.pth... [2024-12-13 06:05:10,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017024_8716288.pth [2024-12-13 06:05:10,389][03213] Saving new best policy, reward=5468.157! [2024-12-13 06:05:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8753152. Throughput: 0: 1068.3. Samples: 8753808. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:05:15,371][03180] Avg episode reward: [(0, '5545.984')] [2024-12-13 06:05:15,372][03213] Saving new best policy, reward=5545.984! [2024-12-13 06:05:20,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8761344. Throughput: 0: 1074.4. Samples: 8761708. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:05:20,371][03180] Avg episode reward: [(0, '5546.896')] [2024-12-13 06:05:20,372][03213] Saving new best policy, reward=5546.896! 
[2024-12-13 06:05:23,331][03226] Updated weights for policy 0, policy_version 17120 (0.0008) [2024-12-13 06:05:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8765440. Throughput: 0: 1096.0. Samples: 8765272. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:05:25,371][03180] Avg episode reward: [(0, '5539.811')] [2024-12-13 06:05:25,386][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017120_8765440.pth... [2024-12-13 06:05:25,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017056_8732672.pth [2024-12-13 06:05:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8769536. Throughput: 0: 1072.8. Samples: 8771004. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:05:30,371][03180] Avg episode reward: [(0, '5585.847')] [2024-12-13 06:05:30,372][03213] Saving new best policy, reward=5585.847! [2024-12-13 06:05:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8777728. Throughput: 0: 1076.4. Samples: 8778944. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:05:35,372][03180] Avg episode reward: [(0, '5568.796')] [2024-12-13 06:05:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8781824. Throughput: 0: 1105.3. Samples: 8782692. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:05:40,371][03180] Avg episode reward: [(0, '5544.647')] [2024-12-13 06:05:40,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017152_8781824.pth... [2024-12-13 06:05:40,396][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017088_8749056.pth [2024-12-13 06:05:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8785920. Throughput: 0: 1075.4. Samples: 8788144. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:05:45,371][03180] Avg episode reward: [(0, '5549.642')] [2024-12-13 06:05:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1124.7). Total num frames: 8794112. Throughput: 0: 1125.4. Samples: 8796052. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:05:50,371][03180] Avg episode reward: [(0, '5551.880')] [2024-12-13 06:05:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8798208. Throughput: 0: 1158.4. Samples: 8799936. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:05:55,371][03180] Avg episode reward: [(0, '5556.249')] [2024-12-13 06:05:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017184_8798208.pth... [2024-12-13 06:05:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017120_8765440.pth [2024-12-13 06:05:59,978][03226] Updated weights for policy 0, policy_version 17200 (0.0010) [2024-12-13 06:06:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8806400. Throughput: 0: 1142.0. Samples: 8805196. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:06:00,371][03180] Avg episode reward: [(0, '5589.240')] [2024-12-13 06:06:00,372][03213] Saving new best policy, reward=5589.240! [2024-12-13 06:06:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8810496. Throughput: 0: 1142.1. Samples: 8813104. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:06:05,371][03180] Avg episode reward: [(0, '5598.770')] [2024-12-13 06:06:05,372][03213] Saving new best policy, reward=5598.770! [2024-12-13 06:06:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1138.5). Total num frames: 8818688. Throughput: 0: 1148.1. Samples: 8816936. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:06:10,371][03180] Avg episode reward: [(0, '5597.942')] [2024-12-13 06:06:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017224_8818688.pth... [2024-12-13 06:06:10,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017152_8781824.pth [2024-12-13 06:06:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8822784. Throughput: 0: 1142.8. Samples: 8822428. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:06:15,371][03180] Avg episode reward: [(0, '5601.051')] [2024-12-13 06:06:15,372][03213] Saving new best policy, reward=5601.051! [2024-12-13 06:06:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8826880. Throughput: 0: 1136.4. Samples: 8830080. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:06:20,371][03180] Avg episode reward: [(0, '5499.521')] [2024-12-13 06:06:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8835072. Throughput: 0: 1141.3. Samples: 8834052. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:06:25,371][03180] Avg episode reward: [(0, '5491.702')] [2024-12-13 06:06:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017256_8835072.pth... [2024-12-13 06:06:25,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017184_8798208.pth [2024-12-13 06:06:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8839168. Throughput: 0: 1144.3. Samples: 8839636. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:06:30,371][03180] Avg episode reward: [(0, '5498.006')] [2024-12-13 06:06:35,166][03226] Updated weights for policy 0, policy_version 17280 (0.0013) [2024-12-13 06:06:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8847360. Throughput: 0: 1138.0. Samples: 8847264. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:06:35,371][03180] Avg episode reward: [(0, '5498.435')] [2024-12-13 06:06:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8851456. Throughput: 0: 1139.0. Samples: 8851192. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:06:40,377][03180] Avg episode reward: [(0, '5482.475')] [2024-12-13 06:06:40,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017288_8851456.pth... [2024-12-13 06:06:40,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017224_8818688.pth [2024-12-13 06:06:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8855552. Throughput: 0: 1150.5. Samples: 8856968. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:06:45,371][03180] Avg episode reward: [(0, '5444.237')] [2024-12-13 06:06:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8863744. Throughput: 0: 1134.1. Samples: 8864140. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:06:50,371][03180] Avg episode reward: [(0, '5449.217')] [2024-12-13 06:06:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8867840. Throughput: 0: 1132.7. Samples: 8867908. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:06:55,371][03180] Avg episode reward: [(0, '5485.569')] [2024-12-13 06:06:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017320_8867840.pth... [2024-12-13 06:06:55,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017256_8835072.pth [2024-12-13 06:07:00,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 8871936. Throughput: 0: 1146.5. Samples: 8874024. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:07:00,373][03180] Avg episode reward: [(0, '5486.232')] [2024-12-13 06:07:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8880128. Throughput: 0: 1132.5. Samples: 8881044. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:07:05,371][03180] Avg episode reward: [(0, '5485.780')] [2024-12-13 06:07:10,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8884224. Throughput: 0: 1128.6. Samples: 8884840. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:07:10,371][03180] Avg episode reward: [(0, '5456.389')] [2024-12-13 06:07:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017352_8884224.pth... [2024-12-13 06:07:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017288_8851456.pth [2024-12-13 06:07:10,745][03226] Updated weights for policy 0, policy_version 17360 (0.0009) [2024-12-13 06:07:15,372][03180] Fps is (10 sec: 819.1, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 8888320. Throughput: 0: 1149.0. Samples: 8891344. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:07:15,373][03180] Avg episode reward: [(0, '5319.567')] [2024-12-13 06:07:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8896512. Throughput: 0: 1127.2. Samples: 8897988. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:07:20,371][03180] Avg episode reward: [(0, '5323.169')] [2024-12-13 06:07:25,371][03180] Fps is (10 sec: 1638.7, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 8904704. Throughput: 0: 1126.3. Samples: 8901876. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:07:25,371][03180] Avg episode reward: [(0, '5321.306')] [2024-12-13 06:07:25,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017392_8904704.pth... 
[2024-12-13 06:07:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017320_8867840.pth [2024-12-13 06:07:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8908800. Throughput: 0: 1151.0. Samples: 8908764. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:07:30,371][03180] Avg episode reward: [(0, '5322.573')] [2024-12-13 06:07:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8912896. Throughput: 0: 1133.4. Samples: 8915144. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:07:35,371][03180] Avg episode reward: [(0, '5321.909')] [2024-12-13 06:07:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8921088. Throughput: 0: 1136.9. Samples: 8919068. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:07:40,371][03180] Avg episode reward: [(0, '5360.255')] [2024-12-13 06:07:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017424_8921088.pth... [2024-12-13 06:07:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017352_8884224.pth [2024-12-13 06:07:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8925184. Throughput: 0: 1153.4. Samples: 8925924. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:07:45,371][03180] Avg episode reward: [(0, '5407.882')] [2024-12-13 06:07:47,858][03226] Updated weights for policy 0, policy_version 17440 (0.0009) [2024-12-13 06:07:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8929280. Throughput: 0: 1135.8. Samples: 8932156. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:07:50,371][03180] Avg episode reward: [(0, '5418.761')] [2024-12-13 06:07:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8937472. Throughput: 0: 1137.6. Samples: 8936032. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:07:55,372][03180] Avg episode reward: [(0, '5414.845')] [2024-12-13 06:07:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017456_8937472.pth... [2024-12-13 06:07:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017392_8904704.pth [2024-12-13 06:08:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 8941568. Throughput: 0: 1155.9. Samples: 8943356. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:08:00,371][03180] Avg episode reward: [(0, '5417.923')] [2024-12-13 06:08:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8945664. Throughput: 0: 1140.4. Samples: 8949304. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:08:05,371][03180] Avg episode reward: [(0, '5348.737')] [2024-12-13 06:08:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8953856. Throughput: 0: 1140.3. Samples: 8953188. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:08:10,371][03180] Avg episode reward: [(0, '5327.362')] [2024-12-13 06:08:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017488_8953856.pth... [2024-12-13 06:08:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017424_8921088.pth [2024-12-13 06:08:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 8957952. Throughput: 0: 1152.6. Samples: 8960632. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:08:15,371][03180] Avg episode reward: [(0, '5344.312')] [2024-12-13 06:08:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 8966144. Throughput: 0: 1134.9. Samples: 8966216. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:08:20,371][03180] Avg episode reward: [(0, '5334.346')] [2024-12-13 06:08:23,139][03226] Updated weights for policy 0, policy_version 17520 (0.0011) [2024-12-13 06:08:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8970240. Throughput: 0: 1135.4. Samples: 8970160. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:08:25,371][03180] Avg episode reward: [(0, '5336.945')] [2024-12-13 06:08:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017520_8970240.pth... [2024-12-13 06:08:25,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017456_8937472.pth [2024-12-13 06:08:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 8978432. Throughput: 0: 1159.1. Samples: 8978084. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:08:30,375][03180] Avg episode reward: [(0, '5365.470')] [2024-12-13 06:08:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8982528. Throughput: 0: 1136.0. Samples: 8983276. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:08:35,371][03180] Avg episode reward: [(0, '5402.517')] [2024-12-13 06:08:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 8986624. Throughput: 0: 1137.2. Samples: 8987208. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:08:40,371][03180] Avg episode reward: [(0, '5459.096')] [2024-12-13 06:08:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017552_8986624.pth... [2024-12-13 06:08:40,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017488_8953856.pth [2024-12-13 06:08:45,376][03180] Fps is (10 sec: 1228.1, 60 sec: 1160.4, 300 sec: 1138.5). Total num frames: 8994816. Throughput: 0: 1145.5. Samples: 8994912. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:08:45,377][03180] Avg episode reward: [(0, '5573.358')] [2024-12-13 06:08:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 8998912. Throughput: 0: 1134.8. Samples: 9000372. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:08:50,371][03180] Avg episode reward: [(0, '5568.739')] [2024-12-13 06:08:55,373][03180] Fps is (10 sec: 1229.2, 60 sec: 1160.5, 300 sec: 1138.6). Total num frames: 9007104. Throughput: 0: 1134.2. Samples: 9004228. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:08:55,373][03180] Avg episode reward: [(0, '5578.946')] [2024-12-13 06:08:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017592_9007104.pth... [2024-12-13 06:08:55,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017520_8970240.pth [2024-12-13 06:08:58,474][03226] Updated weights for policy 0, policy_version 17600 (0.0009) [2024-12-13 06:09:00,374][03180] Fps is (10 sec: 1228.3, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9011200. Throughput: 0: 1139.3. Samples: 9011904. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:09:00,375][03180] Avg episode reward: [(0, '5594.174')] [2024-12-13 06:09:05,371][03180] Fps is (10 sec: 819.4, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9015296. Throughput: 0: 1142.0. Samples: 9017604. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:09:05,371][03180] Avg episode reward: [(0, '5536.158')] [2024-12-13 06:09:10,371][03180] Fps is (10 sec: 1229.3, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9023488. Throughput: 0: 1132.8. Samples: 9021136. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:09:10,371][03180] Avg episode reward: [(0, '5505.213')] [2024-12-13 06:09:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017624_9023488.pth... 
[2024-12-13 06:09:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017552_8986624.pth [2024-12-13 06:09:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9027584. Throughput: 0: 1114.1. Samples: 9028220. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:09:15,371][03180] Avg episode reward: [(0, '5379.440')] [2024-12-13 06:09:20,374][03180] Fps is (10 sec: 818.9, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 9031680. Throughput: 0: 1097.1. Samples: 9032648. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:09:20,378][03180] Avg episode reward: [(0, '5355.422')] [2024-12-13 06:09:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9035776. Throughput: 0: 1068.6. Samples: 9035296. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:09:25,371][03180] Avg episode reward: [(0, '5354.674')] [2024-12-13 06:09:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017648_9035776.pth... [2024-12-13 06:09:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017592_9007104.pth [2024-12-13 06:09:30,371][03180] Fps is (10 sec: 1229.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9043968. Throughput: 0: 1070.4. Samples: 9043072. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:09:30,371][03180] Avg episode reward: [(0, '5444.784')] [2024-12-13 06:09:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9048064. Throughput: 0: 1102.6. Samples: 9049988. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:09:35,373][03180] Avg episode reward: [(0, '5505.356')] [2024-12-13 06:09:37,759][03226] Updated weights for policy 0, policy_version 17680 (0.0009) [2024-12-13 06:09:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9052160. Throughput: 0: 1070.4. Samples: 9052392. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:09:40,371][03180] Avg episode reward: [(0, '5478.579')] [2024-12-13 06:09:40,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017680_9052160.pth... [2024-12-13 06:09:40,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017624_9023488.pth [2024-12-13 06:09:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1124.7). Total num frames: 9060352. Throughput: 0: 1070.9. Samples: 9060092. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:09:45,371][03180] Avg episode reward: [(0, '5480.680')] [2024-12-13 06:09:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9064448. Throughput: 0: 1101.5. Samples: 9067172. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:09:50,371][03180] Avg episode reward: [(0, '5513.615')] [2024-12-13 06:09:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 9068544. Throughput: 0: 1079.1. Samples: 9069696. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:09:55,371][03180] Avg episode reward: [(0, '5460.916')] [2024-12-13 06:09:55,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017712_9068544.pth... [2024-12-13 06:09:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017648_9035776.pth [2024-12-13 06:10:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9076736. Throughput: 0: 1085.5. Samples: 9077068. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:10:00,371][03180] Avg episode reward: [(0, '5455.373')] [2024-12-13 06:10:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9080832. Throughput: 0: 1153.2. Samples: 9084536. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:10:05,371][03180] Avg episode reward: [(0, '5399.611')] [2024-12-13 06:10:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9089024. Throughput: 0: 1151.6. Samples: 9087120. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:10:10,371][03180] Avg episode reward: [(0, '5404.387')] [2024-12-13 06:10:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017752_9089024.pth... [2024-12-13 06:10:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017680_9052160.pth [2024-12-13 06:10:13,362][03226] Updated weights for policy 0, policy_version 17760 (0.0010) [2024-12-13 06:10:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9093120. Throughput: 0: 1130.3. Samples: 9093936. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:10:15,371][03180] Avg episode reward: [(0, '5379.552')] [2024-12-13 06:10:20,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1160.6, 300 sec: 1138.5). Total num frames: 9101312. Throughput: 0: 1149.1. Samples: 9101700. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:10:20,372][03180] Avg episode reward: [(0, '5345.814')] [2024-12-13 06:10:25,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9105408. Throughput: 0: 1156.1. Samples: 9104420. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:10:25,373][03180] Avg episode reward: [(0, '5346.927')] [2024-12-13 06:10:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017784_9105408.pth... [2024-12-13 06:10:25,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017712_9068544.pth [2024-12-13 06:10:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9109504. Throughput: 0: 1133.2. Samples: 9111084. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:10:30,371][03180] Avg episode reward: [(0, '5358.212')] [2024-12-13 06:10:35,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9117696. Throughput: 0: 1147.7. Samples: 9118820. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:10:35,371][03180] Avg episode reward: [(0, '5373.433')] [2024-12-13 06:10:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9121792. Throughput: 0: 1155.4. Samples: 9121688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:10:40,371][03180] Avg episode reward: [(0, '5365.420')] [2024-12-13 06:10:40,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017816_9121792.pth... [2024-12-13 06:10:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017752_9089024.pth [2024-12-13 06:10:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9125888. Throughput: 0: 1133.2. Samples: 9128060. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:10:45,371][03180] Avg episode reward: [(0, '5354.341')] [2024-12-13 06:10:48,696][03226] Updated weights for policy 0, policy_version 17840 (0.0009) [2024-12-13 06:10:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9134080. Throughput: 0: 1140.2. Samples: 9135844. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:10:50,371][03180] Avg episode reward: [(0, '5459.169')] [2024-12-13 06:10:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9138176. Throughput: 0: 1152.7. Samples: 9138992. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:10:55,371][03180] Avg episode reward: [(0, '5413.766')] [2024-12-13 06:10:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017848_9138176.pth... 
[2024-12-13 06:10:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017784_9105408.pth [2024-12-13 06:11:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9146368. Throughput: 0: 1138.8. Samples: 9145180. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:11:00,371][03180] Avg episode reward: [(0, '5408.954')] [2024-12-13 06:11:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9150464. Throughput: 0: 1140.5. Samples: 9153020. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:11:05,371][03180] Avg episode reward: [(0, '5407.263')] [2024-12-13 06:11:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9154560. Throughput: 0: 1154.3. Samples: 9156360. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:11:10,372][03180] Avg episode reward: [(0, '5320.985')] [2024-12-13 06:11:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017880_9154560.pth... [2024-12-13 06:11:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017816_9121792.pth [2024-12-13 06:11:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9162752. Throughput: 0: 1137.9. Samples: 9162288. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:11:15,371][03180] Avg episode reward: [(0, '5273.291')] [2024-12-13 06:11:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9166848. Throughput: 0: 1142.6. Samples: 9170236. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:11:20,371][03180] Avg episode reward: [(0, '5322.899')] [2024-12-13 06:11:24,530][03226] Updated weights for policy 0, policy_version 17920 (0.0009) [2024-12-13 06:11:25,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9175040. Throughput: 0: 1157.6. Samples: 9173784. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:11:25,376][03180] Avg episode reward: [(0, '5380.429')] [2024-12-13 06:11:25,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017920_9175040.pth... [2024-12-13 06:11:25,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017848_9138176.pth [2024-12-13 06:11:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9179136. Throughput: 0: 1139.5. Samples: 9179336. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:11:30,371][03180] Avg episode reward: [(0, '5359.766')] [2024-12-13 06:11:35,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9187328. Throughput: 0: 1144.1. Samples: 9187328. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:11:35,371][03180] Avg episode reward: [(0, '5356.098')] [2024-12-13 06:11:40,377][03180] Fps is (10 sec: 1228.0, 60 sec: 1160.4, 300 sec: 1138.5). Total num frames: 9191424. Throughput: 0: 1161.1. Samples: 9191248. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:11:40,378][03180] Avg episode reward: [(0, '5348.923')] [2024-12-13 06:11:40,388][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017952_9191424.pth... [2024-12-13 06:11:40,396][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017880_9154560.pth [2024-12-13 06:11:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9195520. Throughput: 0: 1140.6. Samples: 9196508. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:11:45,371][03180] Avg episode reward: [(0, '5378.748')] [2024-12-13 06:11:50,371][03180] Fps is (10 sec: 1229.6, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9203712. Throughput: 0: 1141.2. Samples: 9204372. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:11:50,371][03180] Avg episode reward: [(0, '5404.628')] [2024-12-13 06:11:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.6). Total num frames: 9207808. Throughput: 0: 1152.4. Samples: 9208216. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:11:55,371][03180] Avg episode reward: [(0, '5406.511')] [2024-12-13 06:11:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000017984_9207808.pth... [2024-12-13 06:11:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017920_9175040.pth [2024-12-13 06:12:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9211904. Throughput: 0: 1138.8. Samples: 9213532. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:12:00,371][03180] Avg episode reward: [(0, '5394.275')] [2024-12-13 06:12:01,126][03226] Updated weights for policy 0, policy_version 18000 (0.0010) [2024-12-13 06:12:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9220096. Throughput: 0: 1129.4. Samples: 9221060. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:12:05,371][03180] Avg episode reward: [(0, '5425.112')] [2024-12-13 06:12:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.6). Total num frames: 9224192. Throughput: 0: 1135.6. Samples: 9224880. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:12:10,371][03180] Avg episode reward: [(0, '5471.728')] [2024-12-13 06:12:10,392][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018016_9224192.pth... [2024-12-13 06:12:10,397][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017952_9191424.pth [2024-12-13 06:12:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9228288. Throughput: 0: 1136.8. Samples: 9230492. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:12:15,371][03180] Avg episode reward: [(0, '5467.311')] [2024-12-13 06:12:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9236480. Throughput: 0: 1119.4. Samples: 9237700. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:12:20,372][03180] Avg episode reward: [(0, '5487.576')] [2024-12-13 06:12:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9240576. Throughput: 0: 1119.6. Samples: 9241624. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:12:25,371][03180] Avg episode reward: [(0, '5483.377')] [2024-12-13 06:12:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018048_9240576.pth... [2024-12-13 06:12:25,392][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000017984_9207808.pth [2024-12-13 06:12:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9244672. Throughput: 0: 1136.1. Samples: 9247632. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:12:30,371][03180] Avg episode reward: [(0, '5485.025')] [2024-12-13 06:12:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9252864. Throughput: 0: 1116.2. Samples: 9254600. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:12:35,371][03180] Avg episode reward: [(0, '5484.565')] [2024-12-13 06:12:36,887][03226] Updated weights for policy 0, policy_version 18080 (0.0009) [2024-12-13 06:12:40,371][03180] Fps is (10 sec: 1638.4, 60 sec: 1160.7, 300 sec: 1138.5). Total num frames: 9261056. Throughput: 0: 1116.5. Samples: 9258460. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:12:40,371][03180] Avg episode reward: [(0, '5527.391')] [2024-12-13 06:12:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018088_9261056.pth... 
[2024-12-13 06:12:40,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018016_9224192.pth [2024-12-13 06:12:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9265152. Throughput: 0: 1136.1. Samples: 9264656. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:12:45,373][03180] Avg episode reward: [(0, '5494.831')] [2024-12-13 06:12:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9269248. Throughput: 0: 1115.6. Samples: 9271260. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:12:50,371][03180] Avg episode reward: [(0, '5500.691')] [2024-12-13 06:12:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9277440. Throughput: 0: 1118.8. Samples: 9275224. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:12:55,372][03180] Avg episode reward: [(0, '5476.914')] [2024-12-13 06:12:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018120_9277440.pth... [2024-12-13 06:12:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018048_9240576.pth [2024-12-13 06:13:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9281536. Throughput: 0: 1143.5. Samples: 9281948. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:13:00,371][03180] Avg episode reward: [(0, '5483.219')] [2024-12-13 06:13:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9285632. Throughput: 0: 1125.4. Samples: 9288344. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:13:05,371][03180] Avg episode reward: [(0, '5482.123')] [2024-12-13 06:13:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9293824. Throughput: 0: 1124.4. Samples: 9292220. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:13:10,372][03180] Avg episode reward: [(0, '5485.427')] [2024-12-13 06:13:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018152_9293824.pth... [2024-12-13 06:13:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018088_9261056.pth [2024-12-13 06:13:12,610][03226] Updated weights for policy 0, policy_version 18160 (0.0009) [2024-12-13 06:13:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9297920. Throughput: 0: 1145.5. Samples: 9299180. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:13:15,376][03180] Avg episode reward: [(0, '5426.840')] [2024-12-13 06:13:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9302016. Throughput: 0: 1124.1. Samples: 9305184. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:13:20,371][03180] Avg episode reward: [(0, '5426.476')] [2024-12-13 06:13:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9310208. Throughput: 0: 1124.5. Samples: 9309064. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:13:25,371][03180] Avg episode reward: [(0, '5434.239')] [2024-12-13 06:13:25,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018184_9310208.pth... [2024-12-13 06:13:25,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018120_9277440.pth [2024-12-13 06:13:30,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1160.4, 300 sec: 1124.6). Total num frames: 9314304. Throughput: 0: 1121.0. Samples: 9315108. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:13:30,379][03180] Avg episode reward: [(0, '5437.978')] [2024-12-13 06:13:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9318400. Throughput: 0: 1074.0. Samples: 9319588. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:13:35,371][03180] Avg episode reward: [(0, '5440.703')] [2024-12-13 06:13:40,372][03180] Fps is (10 sec: 819.4, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 9322496. Throughput: 0: 1064.3. Samples: 9323120. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:13:40,373][03180] Avg episode reward: [(0, '5467.428')] [2024-12-13 06:13:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018208_9322496.pth... [2024-12-13 06:13:40,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018152_9293824.pth [2024-12-13 06:13:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9330688. Throughput: 0: 1083.1. Samples: 9330688. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:13:45,371][03180] Avg episode reward: [(0, '5454.046')] [2024-12-13 06:13:50,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9334784. Throughput: 0: 1069.5. Samples: 9336472. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:13:50,371][03180] Avg episode reward: [(0, '5470.317')] [2024-12-13 06:13:52,445][03226] Updated weights for policy 0, policy_version 18240 (0.0009) [2024-12-13 06:13:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 9338880. Throughput: 0: 1053.9. Samples: 9339644. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:13:55,371][03180] Avg episode reward: [(0, '5425.828')] [2024-12-13 06:13:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018240_9338880.pth... [2024-12-13 06:13:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018184_9310208.pth [2024-12-13 06:14:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9347072. Throughput: 0: 1068.4. Samples: 9347256. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:14:00,371][03180] Avg episode reward: [(0, '5439.254')] [2024-12-13 06:14:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9351168. Throughput: 0: 1075.1. Samples: 9353564. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:14:05,371][03180] Avg episode reward: [(0, '5521.064')] [2024-12-13 06:14:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1110.8). Total num frames: 9355264. Throughput: 0: 1052.2. Samples: 9356412. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:14:10,371][03180] Avg episode reward: [(0, '5457.231')] [2024-12-13 06:14:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018272_9355264.pth... [2024-12-13 06:14:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018208_9322496.pth [2024-12-13 06:14:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9363456. Throughput: 0: 1083.4. Samples: 9363856. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:14:15,372][03180] Avg episode reward: [(0, '5498.309')] [2024-12-13 06:14:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9367552. Throughput: 0: 1129.6. Samples: 9370420. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:14:20,371][03180] Avg episode reward: [(0, '5435.206')] [2024-12-13 06:14:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9375744. Throughput: 0: 1108.0. Samples: 9372980. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:14:25,371][03180] Avg episode reward: [(0, '5404.838')] [2024-12-13 06:14:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018312_9375744.pth... 
[2024-12-13 06:14:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018240_9338880.pth [2024-12-13 06:14:28,551][03226] Updated weights for policy 0, policy_version 18320 (0.0008) [2024-12-13 06:14:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9379840. Throughput: 0: 1105.2. Samples: 9380420. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:14:30,372][03180] Avg episode reward: [(0, '5409.450')] [2024-12-13 06:14:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9383936. Throughput: 0: 1131.0. Samples: 9387368. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:14:35,371][03180] Avg episode reward: [(0, '5358.578')] [2024-12-13 06:14:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 9392128. Throughput: 0: 1115.6. Samples: 9389848. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:14:40,371][03180] Avg episode reward: [(0, '5317.471')] [2024-12-13 06:14:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018344_9392128.pth... [2024-12-13 06:14:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018272_9355264.pth [2024-12-13 06:14:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9396224. Throughput: 0: 1111.7. Samples: 9397284. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:14:45,371][03180] Avg episode reward: [(0, '5380.884')] [2024-12-13 06:14:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9404416. Throughput: 0: 1130.0. Samples: 9404416. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:14:50,371][03180] Avg episode reward: [(0, '5335.671')] [2024-12-13 06:14:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9408512. Throughput: 0: 1125.7. Samples: 9407068. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:14:55,371][03180] Avg episode reward: [(0, '5328.384')] [2024-12-13 06:14:55,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018376_9408512.pth... [2024-12-13 06:14:55,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018312_9375744.pth [2024-12-13 06:15:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9412608. Throughput: 0: 1123.5. Samples: 9414412. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:15:00,371][03180] Avg episode reward: [(0, '5312.769')] [2024-12-13 06:15:04,029][03226] Updated weights for policy 0, policy_version 18400 (0.0012) [2024-12-13 06:15:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9420800. Throughput: 0: 1140.0. Samples: 9421720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:15:05,372][03180] Avg episode reward: [(0, '5328.626')] [2024-12-13 06:15:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9424896. Throughput: 0: 1142.4. Samples: 9424388. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:15:10,371][03180] Avg episode reward: [(0, '5301.779')] [2024-12-13 06:15:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018408_9424896.pth... [2024-12-13 06:15:10,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018344_9392128.pth [2024-12-13 06:15:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9428992. Throughput: 0: 1127.7. Samples: 9431168. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:15:15,371][03180] Avg episode reward: [(0, '5321.292')] [2024-12-13 06:15:20,377][03180] Fps is (10 sec: 1228.0, 60 sec: 1160.4, 300 sec: 1124.6). Total num frames: 9437184. Throughput: 0: 1149.0. Samples: 9439080. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:15:20,378][03180] Avg episode reward: [(0, '5318.056')] [2024-12-13 06:15:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9441280. Throughput: 0: 1150.8. Samples: 9441636. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:15:25,371][03180] Avg episode reward: [(0, '5347.386')] [2024-12-13 06:15:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018440_9441280.pth... [2024-12-13 06:15:25,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018376_9408512.pth [2024-12-13 06:15:30,371][03180] Fps is (10 sec: 1229.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9449472. Throughput: 0: 1136.3. Samples: 9448416. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:15:30,371][03180] Avg episode reward: [(0, '5352.003')] [2024-12-13 06:15:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9453568. Throughput: 0: 1152.7. Samples: 9456288. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:15:35,371][03180] Avg episode reward: [(0, '5339.448')] [2024-12-13 06:15:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9457664. Throughput: 0: 1154.4. Samples: 9459016. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:15:40,371][03180] Avg episode reward: [(0, '5430.130')] [2024-12-13 06:15:40,382][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018472_9457664.pth... [2024-12-13 06:15:40,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018408_9424896.pth [2024-12-13 06:15:41,200][03226] Updated weights for policy 0, policy_version 18480 (0.0009) [2024-12-13 06:15:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9465856. Throughput: 0: 1134.3. Samples: 9465456. 
Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:15:45,372][03180] Avg episode reward: [(0, '5428.070')] [2024-12-13 06:15:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9469952. Throughput: 0: 1145.3. Samples: 9473260. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:15:50,371][03180] Avg episode reward: [(0, '5448.428')] [2024-12-13 06:15:55,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1160.4, 300 sec: 1124.6). Total num frames: 9478144. Throughput: 0: 1153.5. Samples: 9476300. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:15:55,376][03180] Avg episode reward: [(0, '5467.934')] [2024-12-13 06:15:55,388][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018512_9478144.pth... [2024-12-13 06:15:55,402][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018440_9441280.pth [2024-12-13 06:16:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9482240. Throughput: 0: 1136.4. Samples: 9482304. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:16:00,371][03180] Avg episode reward: [(0, '5559.944')] [2024-12-13 06:16:05,371][03180] Fps is (10 sec: 819.6, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9486336. Throughput: 0: 1132.1. Samples: 9490016. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:16:05,372][03180] Avg episode reward: [(0, '5536.392')] [2024-12-13 06:16:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9494528. Throughput: 0: 1147.9. Samples: 9493292. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:16:10,372][03180] Avg episode reward: [(0, '5595.463')] [2024-12-13 06:16:10,385][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018544_9494528.pth... 
[2024-12-13 06:16:10,393][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018472_9457664.pth [2024-12-13 06:16:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9498624. Throughput: 0: 1123.2. Samples: 9498960. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:16:15,374][03180] Avg episode reward: [(0, '5516.869')] [2024-12-13 06:16:16,927][03226] Updated weights for policy 0, policy_version 18560 (0.0009) [2024-12-13 06:16:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.7, 300 sec: 1124.7). Total num frames: 9506816. Throughput: 0: 1123.2. Samples: 9506832. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:16:20,371][03180] Avg episode reward: [(0, '5529.720')] [2024-12-13 06:16:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9510912. Throughput: 0: 1147.7. Samples: 9510664. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:16:25,371][03180] Avg episode reward: [(0, '5537.075')] [2024-12-13 06:16:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018576_9510912.pth... [2024-12-13 06:16:25,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018512_9478144.pth [2024-12-13 06:16:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9515008. Throughput: 0: 1124.3. Samples: 9516048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:16:30,371][03180] Avg episode reward: [(0, '5499.618')] [2024-12-13 06:16:35,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9523200. Throughput: 0: 1124.9. Samples: 9523880. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:16:35,371][03180] Avg episode reward: [(0, '5473.658')] [2024-12-13 06:16:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9527296. Throughput: 0: 1145.4. Samples: 9527836. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:16:40,371][03180] Avg episode reward: [(0, '5496.142')] [2024-12-13 06:16:40,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018608_9527296.pth... [2024-12-13 06:16:40,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018544_9494528.pth [2024-12-13 06:16:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9531392. Throughput: 0: 1125.4. Samples: 9532948. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:16:45,371][03180] Avg episode reward: [(0, '5544.971')] [2024-12-13 06:16:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9539584. Throughput: 0: 1126.9. Samples: 9540728. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:16:50,371][03180] Avg episode reward: [(0, '5536.963')] [2024-12-13 06:16:52,298][03226] Updated weights for policy 0, policy_version 18640 (0.0010) [2024-12-13 06:16:55,377][03180] Fps is (10 sec: 1228.0, 60 sec: 1092.2, 300 sec: 1124.6). Total num frames: 9543680. Throughput: 0: 1141.2. Samples: 9544652. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:16:55,378][03180] Avg episode reward: [(0, '5545.620')] [2024-12-13 06:16:55,394][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018640_9543680.pth... [2024-12-13 06:16:55,399][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018576_9510912.pth [2024-12-13 06:17:00,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9547776. Throughput: 0: 1139.2. Samples: 9550224. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:17:00,371][03180] Avg episode reward: [(0, '5523.907')] [2024-12-13 06:17:05,371][03180] Fps is (10 sec: 1229.6, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9555968. Throughput: 0: 1134.7. Samples: 9557892. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:17:05,371][03180] Avg episode reward: [(0, '5504.966')] [2024-12-13 06:17:10,379][03180] Fps is (10 sec: 1637.0, 60 sec: 1160.4, 300 sec: 1138.5). Total num frames: 9564160. Throughput: 0: 1138.0. Samples: 9561884. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 06:17:10,380][03180] Avg episode reward: [(0, '5559.400')] [2024-12-13 06:17:10,394][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018680_9564160.pth... [2024-12-13 06:17:10,406][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018608_9527296.pth [2024-12-13 06:17:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9568256. Throughput: 0: 1148.4. Samples: 9567724. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 06:17:15,371][03180] Avg episode reward: [(0, '5558.489')] [2024-12-13 06:17:20,371][03180] Fps is (10 sec: 819.9, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9572352. Throughput: 0: 1136.5. Samples: 9575024. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 06:17:20,371][03180] Avg episode reward: [(0, '5505.399')] [2024-12-13 06:17:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9580544. Throughput: 0: 1137.2. Samples: 9579012. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:17:25,372][03180] Avg episode reward: [(0, '5521.519')] [2024-12-13 06:17:25,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018712_9580544.pth... [2024-12-13 06:17:25,389][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018640_9543680.pth [2024-12-13 06:17:28,868][03226] Updated weights for policy 0, policy_version 18720 (0.0010) [2024-12-13 06:17:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9584640. Throughput: 0: 1151.9. Samples: 9584784. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:17:30,371][03180] Avg episode reward: [(0, '5495.576')] [2024-12-13 06:17:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9588736. Throughput: 0: 1142.6. Samples: 9592144. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:17:35,371][03180] Avg episode reward: [(0, '5514.461')] [2024-12-13 06:17:40,372][03180] Fps is (10 sec: 1228.7, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9596928. Throughput: 0: 1143.1. Samples: 9596084. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:17:40,372][03180] Avg episode reward: [(0, '5574.983')] [2024-12-13 06:17:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018744_9596928.pth... [2024-12-13 06:17:40,388][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018680_9564160.pth [2024-12-13 06:17:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9601024. Throughput: 0: 1142.8. Samples: 9601648. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:17:45,371][03180] Avg episode reward: [(0, '5579.379')] [2024-12-13 06:17:50,371][03180] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9605120. Throughput: 0: 1074.7. Samples: 9606252. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:17:50,371][03180] Avg episode reward: [(0, '5582.370')] [2024-12-13 06:17:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.4, 300 sec: 1110.8). Total num frames: 9609216. Throughput: 0: 1074.1. Samples: 9610208. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:17:55,371][03180] Avg episode reward: [(0, '5612.366')] [2024-12-13 06:17:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018768_9609216.pth... [2024-12-13 06:17:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018712_9580544.pth [2024-12-13 06:17:55,387][03213] Saving new best policy, reward=5612.366! 
[2024-12-13 06:18:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9617408. Throughput: 0: 1112.8. Samples: 9617800. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:18:00,371][03180] Avg episode reward: [(0, '5645.765')] [2024-12-13 06:18:00,376][03213] Saving new best policy, reward=5645.765! [2024-12-13 06:18:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9621504. Throughput: 0: 1072.0. Samples: 9623264. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:18:05,371][03180] Avg episode reward: [(0, '5648.721')] [2024-12-13 06:18:05,372][03213] Saving new best policy, reward=5648.721! [2024-12-13 06:18:06,802][03226] Updated weights for policy 0, policy_version 18800 (0.0009) [2024-12-13 06:18:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.4, 300 sec: 1124.7). Total num frames: 9629696. Throughput: 0: 1072.2. Samples: 9627260. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:18:10,372][03180] Avg episode reward: [(0, '5655.640')] [2024-12-13 06:18:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018808_9629696.pth... [2024-12-13 06:18:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018744_9596928.pth [2024-12-13 06:18:10,384][03213] Saving new best policy, reward=5655.640! [2024-12-13 06:18:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9633792. Throughput: 0: 1115.6. Samples: 9634984. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:18:15,371][03180] Avg episode reward: [(0, '5665.911')] [2024-12-13 06:18:15,372][03213] Saving new best policy, reward=5665.911! [2024-12-13 06:18:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9637888. Throughput: 0: 1073.4. Samples: 9640448. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:18:20,371][03180] Avg episode reward: [(0, '5662.971')] [2024-12-13 06:18:25,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9646080. Throughput: 0: 1073.7. Samples: 9644400. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:18:25,371][03180] Avg episode reward: [(0, '5652.720')] [2024-12-13 06:18:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018840_9646080.pth... [2024-12-13 06:18:25,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018768_9609216.pth [2024-12-13 06:18:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9650176. Throughput: 0: 1119.1. Samples: 9652008. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:18:30,371][03180] Avg episode reward: [(0, '5680.787')] [2024-12-13 06:18:30,372][03213] Saving new best policy, reward=5680.787! [2024-12-13 06:18:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9654272. Throughput: 0: 1141.9. Samples: 9657640. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:18:35,371][03180] Avg episode reward: [(0, '5686.456')] [2024-12-13 06:18:35,372][03213] Saving new best policy, reward=5686.456! [2024-12-13 06:18:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9662464. Throughput: 0: 1135.0. Samples: 9661284. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:18:40,371][03180] Avg episode reward: [(0, '5718.597')] [2024-12-13 06:18:40,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018872_9662464.pth... [2024-12-13 06:18:40,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018808_9629696.pth [2024-12-13 06:18:40,385][03213] Saving new best policy, reward=5718.597! 
[2024-12-13 06:18:42,141][03226] Updated weights for policy 0, policy_version 18880 (0.0009) [2024-12-13 06:18:45,373][03180] Fps is (10 sec: 1638.0, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9670656. Throughput: 0: 1138.5. Samples: 9669036. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:18:45,374][03180] Avg episode reward: [(0, '5755.577')] [2024-12-13 06:18:45,375][03213] Saving new best policy, reward=5755.577! [2024-12-13 06:18:50,382][03180] Fps is (10 sec: 1227.5, 60 sec: 1160.3, 300 sec: 1138.5). Total num frames: 9674752. Throughput: 0: 1144.6. Samples: 9674784. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:18:50,382][03180] Avg episode reward: [(0, '5756.301')] [2024-12-13 06:18:50,386][03213] Saving new best policy, reward=5756.301! [2024-12-13 06:18:55,371][03180] Fps is (10 sec: 819.4, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9678848. Throughput: 0: 1136.5. Samples: 9678404. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:18:55,371][03180] Avg episode reward: [(0, '5791.364')] [2024-12-13 06:18:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018904_9678848.pth... [2024-12-13 06:18:55,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018840_9646080.pth [2024-12-13 06:18:55,384][03213] Saving new best policy, reward=5791.364! [2024-12-13 06:19:00,373][03180] Fps is (10 sec: 1229.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9687040. Throughput: 0: 1136.3. Samples: 9686120. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:19:00,374][03180] Avg episode reward: [(0, '5787.732')] [2024-12-13 06:19:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9691136. Throughput: 0: 1145.2. Samples: 9691984. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:19:05,371][03180] Avg episode reward: [(0, '5793.199')] [2024-12-13 06:19:05,378][03213] Saving new best policy, reward=5793.199! 
[2024-12-13 06:19:10,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9695232. Throughput: 0: 1129.6. Samples: 9695232. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:19:10,371][03180] Avg episode reward: [(0, '5791.482')] [2024-12-13 06:19:10,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018936_9695232.pth... [2024-12-13 06:19:10,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018872_9662464.pth [2024-12-13 06:19:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9703424. Throughput: 0: 1134.2. Samples: 9703048. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:19:15,371][03180] Avg episode reward: [(0, '5793.007')] [2024-12-13 06:19:18,316][03226] Updated weights for policy 0, policy_version 18960 (0.0009) [2024-12-13 06:19:20,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9707520. Throughput: 0: 1142.4. Samples: 9709052. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:19:20,373][03180] Avg episode reward: [(0, '5791.114')] [2024-12-13 06:19:25,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9711616. Throughput: 0: 1122.0. Samples: 9711772. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:19:25,371][03180] Avg episode reward: [(0, '5765.816')] [2024-12-13 06:19:25,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000018968_9711616.pth... [2024-12-13 06:19:25,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018904_9678848.pth [2024-12-13 06:19:30,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9719808. Throughput: 0: 1122.6. Samples: 9719548. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:19:30,371][03180] Avg episode reward: [(0, '5772.453')] [2024-12-13 06:19:35,372][03180] Fps is (10 sec: 1228.6, 60 sec: 1160.5, 300 sec: 1124.7). 
Total num frames: 9723904. Throughput: 0: 1140.8. Samples: 9726108. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:19:35,373][03180] Avg episode reward: [(0, '5749.685')] [2024-12-13 06:19:40,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9728000. Throughput: 0: 1117.1. Samples: 9728672. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:19:40,371][03180] Avg episode reward: [(0, '5668.548')] [2024-12-13 06:19:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019000_9728000.pth... [2024-12-13 06:19:40,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018936_9695232.pth [2024-12-13 06:19:45,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9736192. Throughput: 0: 1118.3. Samples: 9736440. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:19:45,371][03180] Avg episode reward: [(0, '5664.417')] [2024-12-13 06:19:50,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.5, 300 sec: 1124.7). Total num frames: 9740288. Throughput: 0: 1138.5. Samples: 9743216. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:19:50,372][03180] Avg episode reward: [(0, '5677.121')] [2024-12-13 06:19:55,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9744384. Throughput: 0: 1118.8. Samples: 9745580. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:19:55,371][03180] Avg episode reward: [(0, '5588.709')] [2024-12-13 06:19:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019032_9744384.pth... [2024-12-13 06:19:55,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000018968_9711616.pth [2024-12-13 06:19:55,477][03226] Updated weights for policy 0, policy_version 19040 (0.0009) [2024-12-13 06:20:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9752576. Throughput: 0: 1110.2. Samples: 9753008. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:20:00,371][03180] Avg episode reward: [(0, '5565.558')] [2024-12-13 06:20:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9756672. Throughput: 0: 1132.6. Samples: 9760016. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:20:05,371][03180] Avg episode reward: [(0, '5558.025')] [2024-12-13 06:20:10,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9764864. Throughput: 0: 1123.3. Samples: 9762320. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:20:10,371][03180] Avg episode reward: [(0, '5553.881')] [2024-12-13 06:20:10,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019072_9764864.pth... [2024-12-13 06:20:10,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019000_9728000.pth [2024-12-13 06:20:15,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9768960. Throughput: 0: 1112.0. Samples: 9769588. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:20:15,372][03180] Avg episode reward: [(0, '5573.469')] [2024-12-13 06:20:20,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9773056. Throughput: 0: 1129.3. Samples: 9776924. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 06:20:20,371][03180] Avg episode reward: [(0, '5547.838')] [2024-12-13 06:20:25,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9781248. Throughput: 0: 1128.0. Samples: 9779436. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 06:20:25,374][03180] Avg episode reward: [(0, '5543.389')] [2024-12-13 06:20:25,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019104_9781248.pth... 
[2024-12-13 06:20:25,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019032_9744384.pth [2024-12-13 06:20:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9785344. Throughput: 0: 1107.5. Samples: 9786276. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2024-12-13 06:20:30,371][03180] Avg episode reward: [(0, '5542.355')] [2024-12-13 06:20:31,571][03226] Updated weights for policy 0, policy_version 19120 (0.0009) [2024-12-13 06:20:35,371][03180] Fps is (10 sec: 1229.1, 60 sec: 1160.6, 300 sec: 1138.5). Total num frames: 9793536. Throughput: 0: 1121.5. Samples: 9793684. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:20:35,371][03180] Avg episode reward: [(0, '5499.970')] [2024-12-13 06:20:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9797632. Throughput: 0: 1126.1. Samples: 9796256. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:20:40,371][03180] Avg episode reward: [(0, '5543.092')] [2024-12-13 06:20:40,379][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019136_9797632.pth... [2024-12-13 06:20:40,386][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019072_9764864.pth [2024-12-13 06:20:45,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9801728. Throughput: 0: 1110.9. Samples: 9803000. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2024-12-13 06:20:45,371][03180] Avg episode reward: [(0, '5551.664')] [2024-12-13 06:20:50,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9809920. Throughput: 0: 1128.5. Samples: 9810800. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:20:50,374][03180] Avg episode reward: [(0, '5552.722')] [2024-12-13 06:20:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9814016. Throughput: 0: 1139.9. Samples: 9813616. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:20:55,371][03180] Avg episode reward: [(0, '5549.532')] [2024-12-13 06:20:55,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019168_9814016.pth... [2024-12-13 06:20:55,387][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019104_9781248.pth [2024-12-13 06:21:00,371][03180] Fps is (10 sec: 819.4, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9818112. Throughput: 0: 1121.2. Samples: 9820044. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2024-12-13 06:21:00,371][03180] Avg episode reward: [(0, '5546.449')] [2024-12-13 06:21:05,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9826304. Throughput: 0: 1133.0. Samples: 9827908. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:21:05,371][03180] Avg episode reward: [(0, '5586.603')] [2024-12-13 06:21:07,484][03226] Updated weights for policy 0, policy_version 19200 (0.0010) [2024-12-13 06:21:10,375][03180] Fps is (10 sec: 1228.2, 60 sec: 1092.2, 300 sec: 1124.6). Total num frames: 9830400. Throughput: 0: 1143.1. Samples: 9830876. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:21:10,376][03180] Avg episode reward: [(0, '5582.306')] [2024-12-13 06:21:10,388][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019200_9830400.pth... [2024-12-13 06:21:10,394][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019136_9797632.pth [2024-12-13 06:21:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9834496. Throughput: 0: 1128.1. Samples: 9837040. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:21:15,371][03180] Avg episode reward: [(0, '5545.132')] [2024-12-13 06:21:20,371][03180] Fps is (10 sec: 1229.4, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9842688. Throughput: 0: 1132.3. Samples: 9844636. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:21:20,371][03180] Avg episode reward: [(0, '5544.070')] [2024-12-13 06:21:25,376][03180] Fps is (10 sec: 1228.1, 60 sec: 1092.2, 300 sec: 1124.6). Total num frames: 9846784. Throughput: 0: 1151.1. Samples: 9848060. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2024-12-13 06:21:25,377][03180] Avg episode reward: [(0, '5615.373')] [2024-12-13 06:21:25,390][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019232_9846784.pth... [2024-12-13 06:21:25,403][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019168_9814016.pth [2024-12-13 06:21:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9854976. Throughput: 0: 1131.9. Samples: 9853936. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:21:30,371][03180] Avg episode reward: [(0, '5620.735')] [2024-12-13 06:21:35,371][03180] Fps is (10 sec: 1229.5, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9859072. Throughput: 0: 1131.1. Samples: 9861696. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:21:35,371][03180] Avg episode reward: [(0, '5630.193')] [2024-12-13 06:21:40,374][03180] Fps is (10 sec: 1228.4, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9867264. Throughput: 0: 1149.2. Samples: 9865332. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2024-12-13 06:21:40,375][03180] Avg episode reward: [(0, '5594.891')] [2024-12-13 06:21:40,383][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019272_9867264.pth... [2024-12-13 06:21:40,391][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019200_9830400.pth [2024-12-13 06:21:44,542][03226] Updated weights for policy 0, policy_version 19280 (0.0009) [2024-12-13 06:21:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9871360. Throughput: 0: 1128.6. Samples: 9870832. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:21:45,371][03180] Avg episode reward: [(0, '5623.034')] [2024-12-13 06:21:50,371][03180] Fps is (10 sec: 819.5, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9875456. Throughput: 0: 1127.2. Samples: 9878632. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:21:50,371][03180] Avg episode reward: [(0, '5629.718')] [2024-12-13 06:21:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1138.5). Total num frames: 9883648. Throughput: 0: 1148.6. Samples: 9882556. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:21:55,373][03180] Avg episode reward: [(0, '5630.149')] [2024-12-13 06:21:55,378][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019304_9883648.pth... [2024-12-13 06:21:55,384][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019232_9846784.pth [2024-12-13 06:22:00,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9887744. Throughput: 0: 1125.0. Samples: 9887664. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:22:00,371][03180] Avg episode reward: [(0, '5628.492')] [2024-12-13 06:22:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9891840. Throughput: 0: 1074.0. Samples: 9892964. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:22:05,371][03180] Avg episode reward: [(0, '5658.870')] [2024-12-13 06:22:10,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.4, 300 sec: 1110.8). Total num frames: 9895936. Throughput: 0: 1085.6. Samples: 9896904. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:22:10,371][03180] Avg episode reward: [(0, '5653.516')] [2024-12-13 06:22:10,375][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019328_9895936.pth... 
[2024-12-13 06:22:10,383][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019272_9867264.pth [2024-12-13 06:22:15,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9900032. Throughput: 0: 1082.0. Samples: 9902628. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:22:15,371][03180] Avg episode reward: [(0, '5655.719')] [2024-12-13 06:22:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9908224. Throughput: 0: 1073.2. Samples: 9909992. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:22:20,371][03180] Avg episode reward: [(0, '5668.556')] [2024-12-13 06:22:21,917][03226] Updated weights for policy 0, policy_version 19360 (0.0009) [2024-12-13 06:22:25,371][03180] Fps is (10 sec: 1638.3, 60 sec: 1160.6, 300 sec: 1124.7). Total num frames: 9916416. Throughput: 0: 1077.0. Samples: 9913792. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-12-13 06:22:25,372][03180] Avg episode reward: [(0, '5676.073')] [2024-12-13 06:22:25,380][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019368_9916416.pth... [2024-12-13 06:22:25,385][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019304_9883648.pth [2024-12-13 06:22:30,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9920512. Throughput: 0: 1094.5. Samples: 9920084. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:22:30,371][03180] Avg episode reward: [(0, '5700.620')] [2024-12-13 06:22:35,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9924608. Throughput: 0: 1073.3. Samples: 9926932. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:22:35,371][03180] Avg episode reward: [(0, '5682.153')] [2024-12-13 06:22:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9932800. Throughput: 0: 1070.6. Samples: 9930732. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2024-12-13 06:22:40,371][03180] Avg episode reward: [(0, '5711.873')] [2024-12-13 06:22:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019400_9932800.pth... [2024-12-13 06:22:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019328_9895936.pth [2024-12-13 06:22:45,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9936896. Throughput: 0: 1098.4. Samples: 9937092. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:22:45,371][03180] Avg episode reward: [(0, '5715.258')] [2024-12-13 06:22:50,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9940992. Throughput: 0: 1125.8. Samples: 9943624. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:22:50,371][03180] Avg episode reward: [(0, '5688.367')] [2024-12-13 06:22:55,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9949184. Throughput: 0: 1122.1. Samples: 9947400. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:22:55,372][03180] Avg episode reward: [(0, '5679.933')] [2024-12-13 06:22:55,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019432_9949184.pth... [2024-12-13 06:22:55,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019368_9916416.pth [2024-12-13 06:22:58,145][03226] Updated weights for policy 0, policy_version 19440 (0.0010) [2024-12-13 06:23:00,371][03180] Fps is (10 sec: 1228.7, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9953280. Throughput: 0: 1142.8. Samples: 9954056. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:23:00,372][03180] Avg episode reward: [(0, '5660.351')] [2024-12-13 06:23:05,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9957376. Throughput: 0: 1122.8. Samples: 9960516. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:23:05,371][03180] Avg episode reward: [(0, '5665.844')] [2024-12-13 06:23:10,371][03180] Fps is (10 sec: 1228.9, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9965568. Throughput: 0: 1124.0. Samples: 9964372. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:23:10,371][03180] Avg episode reward: [(0, '5670.786')] [2024-12-13 06:23:10,377][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019464_9965568.pth... [2024-12-13 06:23:10,381][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019400_9932800.pth [2024-12-13 06:23:15,373][03180] Fps is (10 sec: 1228.5, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9969664. Throughput: 0: 1145.1. Samples: 9971616. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:23:15,373][03180] Avg episode reward: [(0, '5707.609')] [2024-12-13 06:23:20,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9977856. Throughput: 0: 1124.9. Samples: 9977552. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:23:20,371][03180] Avg episode reward: [(0, '5684.681')] [2024-12-13 06:23:25,372][03180] Fps is (10 sec: 1228.9, 60 sec: 1092.2, 300 sec: 1124.7). Total num frames: 9981952. Throughput: 0: 1129.0. Samples: 9981540. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:23:25,375][03180] Avg episode reward: [(0, '5682.150')] [2024-12-13 06:23:25,381][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019496_9981952.pth... [2024-12-13 06:23:25,396][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019432_9949184.pth [2024-12-13 06:23:30,371][03180] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.7). Total num frames: 9986048. Throughput: 0: 1152.1. Samples: 9988936. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:23:30,371][03180] Avg episode reward: [(0, '5681.089')] [2024-12-13 06:23:34,786][03226] Updated weights for policy 0, policy_version 19520 (0.0009) [2024-12-13 06:23:35,371][03180] Fps is (10 sec: 1229.0, 60 sec: 1160.5, 300 sec: 1124.7). Total num frames: 9994240. Throughput: 0: 1131.2. Samples: 9994528. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:23:35,371][03180] Avg episode reward: [(0, '5686.319')] [2024-12-13 06:23:40,371][03180] Fps is (10 sec: 1228.8, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9998336. Throughput: 0: 1133.3. Samples: 9998400. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2024-12-13 06:23:40,371][03180] Avg episode reward: [(0, '5690.811')] [2024-12-13 06:23:40,376][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019528_9998336.pth... [2024-12-13 06:23:40,382][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019464_9965568.pth [2024-12-13 06:23:44,425][03213] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000000 [2024-12-13 06:23:44,432][03180] Component Batcher_0 stopped! [2024-12-13 06:23:44,431][03213] Stopping Batcher_0... [2024-12-13 06:23:44,433][03213] Loop batcher_evt_loop terminating... [2024-12-13 06:23:44,435][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019544_10006528.pth... [2024-12-13 06:23:44,441][03213] Removing ./train_dir/Ant/checkpoint_p0/checkpoint_000019496_9981952.pth [2024-12-13 06:23:44,446][03213] Saving ./train_dir/Ant/checkpoint_p0/checkpoint_000019544_10006528.pth... [2024-12-13 06:23:44,447][03180] Component RolloutWorker_w5 stopped! [2024-12-13 06:23:44,452][03232] Stopping RolloutWorker_w6... [2024-12-13 06:23:44,453][03232] Loop rollout_proc6_evt_loop terminating... [2024-12-13 06:23:44,454][03228] Stopping RolloutWorker_w2... [2024-12-13 06:23:44,455][03228] Loop rollout_proc2_evt_loop terminating... [2024-12-13 06:23:44,456][03227] Stopping RolloutWorker_w0... 
[2024-12-13 06:23:44,458][03180] Component RolloutWorker_w7 stopped! [2024-12-13 06:23:44,458][03180] Component RolloutWorker_w1 stopped! [2024-12-13 06:23:44,460][03180] Component RolloutWorker_w4 stopped! [2024-12-13 06:23:44,460][03180] Component RolloutWorker_w6 stopped! [2024-12-13 06:23:44,460][03213] Stopping LearnerWorker_p0... [2024-12-13 06:23:44,460][03180] Component RolloutWorker_w2 stopped! [2024-12-13 06:23:44,461][03180] Component RolloutWorker_w0 stopped! [2024-12-13 06:23:44,461][03213] Loop learner_proc0_evt_loop terminating... [2024-12-13 06:23:44,451][03231] Stopping RolloutWorker_w4... [2024-12-13 06:23:44,461][03180] Component RolloutWorker_w3 stopped! [2024-12-13 06:23:44,464][03231] Loop rollout_proc4_evt_loop terminating... [2024-12-13 06:23:44,464][03180] Component LearnerWorker_p0 stopped! [2024-12-13 06:23:44,457][03229] Stopping RolloutWorker_w3... [2024-12-13 06:23:44,447][03233] Stopping RolloutWorker_w5... [2024-12-13 06:23:44,466][03227] Loop rollout_proc0_evt_loop terminating... [2024-12-13 06:23:44,448][03234] Stopping RolloutWorker_w7... [2024-12-13 06:23:44,449][03230] Stopping RolloutWorker_w1... [2024-12-13 06:23:44,465][03229] Loop rollout_proc3_evt_loop terminating... [2024-12-13 06:23:44,466][03233] Loop rollout_proc5_evt_loop terminating... [2024-12-13 06:23:44,472][03230] Loop rollout_proc1_evt_loop terminating... [2024-12-13 06:23:44,472][03234] Loop rollout_proc7_evt_loop terminating... [2024-12-13 06:23:44,593][03226] Weights refcount: 2 0 [2024-12-13 06:23:44,600][03180] Component InferenceWorker_p0-w0 stopped! [2024-12-13 06:23:44,599][03226] Stopping InferenceWorker_p0-w0... [2024-12-13 06:23:44,601][03180] Waiting for process learner_proc0 to stop... [2024-12-13 06:23:44,601][03226] Loop inference_proc0-0_evt_loop terminating... [2024-12-13 06:23:47,578][03180] Waiting for process inference_proc0-0 to join... [2024-12-13 06:23:47,845][03180] Waiting for process rollout_proc0 to join... 
[2024-12-13 06:23:55,683][03180] Waiting for process rollout_proc1 to join... [2024-12-13 06:23:55,692][03180] Waiting for process rollout_proc2 to join... [2024-12-13 06:23:55,754][03180] Waiting for process rollout_proc3 to join... [2024-12-13 06:23:55,759][03180] Waiting for process rollout_proc4 to join... [2024-12-13 06:23:55,765][03180] Waiting for process rollout_proc5 to join... [2024-12-13 06:23:55,769][03180] Waiting for process rollout_proc6 to join... [2024-12-13 06:23:55,773][03180] Waiting for process rollout_proc7 to join... [2024-12-13 06:23:55,780][03180] Batcher 0 profile tree view: batching: 5.5700, releasing_batches: 2.6012 [2024-12-13 06:23:55,783][03180] InferenceWorker_p0-w0 profile tree view: wait_policy: 0.0052 wait_policy_total: 5985.0005 update_model: 87.8432 weight_update: 0.0009 one_step: 0.0024 handle_policy_step: 2626.9417 deserialize: 91.3028, stack: 29.4689, obs_to_device_normalize: 507.5274, forward: 1345.0149, send_messages: 177.2450 prepare_outputs: 268.0433 to_cpu: 36.3204 [2024-12-13 06:23:55,784][03180] Learner 0 profile tree view: misc: 0.0119, prepare_batch: 19.7988 train: 177.9921 epoch_init: 0.0834, minibatch_init: 2.6211, losses_postprocess: 2.9585, kl_divergence: 0.9343, after_optimizer: 3.3015 calculate_losses: 66.9323 losses_init: 0.0860, forward_head: 19.5999, bptt_initial: 0.4105, bptt: 0.4688, tail: 22.1018, advantages_returns: 1.9457, losses: 19.2336 update: 97.5549 clip: 10.8150 [2024-12-13 06:23:55,785][03180] RolloutWorker_w0 profile tree view: wait_for_trajectories: 1.7806, enqueue_policy_requests: 1402.6665, env_step: 5627.1902, overhead: 473.5378, complete_rollouts: 12.2542 save_policy_outputs: 343.7328 split_output_tensors: 136.2608 [2024-12-13 06:23:55,785][03180] RolloutWorker_w7 profile tree view: wait_for_trajectories: 1.5419, enqueue_policy_requests: 1411.8562, env_step: 5594.4321, overhead: 469.8633, complete_rollouts: 10.4404 save_policy_outputs: 347.5896 split_output_tensors: 137.7165 [2024-12-13 
06:23:55,786][03180] Loop Runner_EvtLoop terminating... [2024-12-13 06:23:55,787][03180] Runner profile tree view: main_loop: 9086.8511 [2024-12-13 06:23:55,788][03180] Collected {0: 10006528}, FPS: 1101.2